Human Pose Estimation is the task of estimating the pose of a person in an image or video by localizing keypoints on the body. The keypoints are joints such as the elbow, shoulder, hip, and knee. Pose estimation can be framed as a regression problem: fit a model that, given an image, predicts the location of each keypoint. The keypoints can then be represented as a set of (x, y) coordinates.

The number of keypoints varies by dataset. For example, the LSP dataset has 14 keypoints, the MPII dataset has 16 keypoints, and the COCO dataset has 17 keypoints.
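As a concrete sketch of this representation (assuming the standard COCO keypoint ordering), a 2D pose is simply an N×2 array of pixel coordinates, one row per joint:

```python
import numpy as np

# The 17 COCO keypoints, in the standard COCO ordering.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# A 2D pose prediction is a (17, 2) array: one (x, y) pixel
# coordinate per keypoint.
pose_2d = np.zeros((len(COCO_KEYPOINTS), 2))
print(pose_2d.shape)  # (17, 2)
```

For LSP or MPII, the same structure applies with 14 or 16 rows respectively.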

Pose Estimation can be broadly classified into two categories:

**2D Pose Estimation**: Estimate (x, y) coordinates for each joint in pixel space from RGB input.

**3D Pose Estimation**: Estimate (x, y, z) coordinates in metric space from RGB input.
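The difference between the two settings comes down to the dimensionality of each keypoint. A minimal sketch in terms of array shapes (the metric unit and root joint for the 3D case are assumptions that vary by dataset):

```python
import numpy as np

num_joints = 16  # e.g. MPII; 14 for LSP, 17 for COCO

# 2D pose: one (x, y) pixel coordinate per joint.
pose_2d = np.zeros((num_joints, 2))

# 3D pose: one (x, y, z) coordinate per joint in metric space
# (commonly millimetres, often relative to a root joint such as
# the pelvis).
pose_3d = np.zeros((num_joints, 3))

print(pose_2d.shape, pose_3d.shape)  # (16, 2) (16, 3)
```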

This presentation by Wei Yang provides an informative roadmap of 2D Human Pose Estimation research.