Abstract: Human pose and shape estimation is crucial for perceiving the gestures and actions a person intends. The task must run in real time so that consistency and safety can be ensured in practical settings, and it must run on affordable hardware. We therefore propose an integrated pipeline for robustly perceiving human poses and shapes on an edge device. The pipeline combines YOLOv8, AlphaPose, MotionBERT, and ExPose, which perform bounding box detection, 2D skeleton estimation, 2D-to-3D pose lifting, and 3D mesh reconstruction, respectively. Our experiments show that the pipeline processes input images accurately and precisely in real time on the Jetson TX, a typical edge device.
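
The four-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, function signatures, and data flow between stages are assumptions made for illustration, and each stage is a plain callable standing in for the corresponding model (YOLOv8, AlphaPose, MotionBERT, ExPose) rather than a call into any real library. In particular, the sketch processes a single frame and assumes the mesh stage receives the image, the box, and the lifted 3D keypoints, neither of which the abstract specifies.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

# Hypothetical data types, chosen for illustration only.
Box = Tuple[float, float, float, float]        # (x1, y1, x2, y2) in pixels
Keypoints2D = List[Tuple[float, float]]        # one (x, y) per joint
Keypoints3D = List[Tuple[float, float, float]] # one (x, y, z) per joint


@dataclass
class PersonResult:
    """Outputs of all four stages for one detected person."""
    box: Box
    keypoints_2d: Keypoints2D
    keypoints_3d: Keypoints3D
    mesh: Any


@dataclass
class Pipeline:
    """Chains four stage callables; each one stands in for a model."""
    detect: Callable[[Any], Sequence[Box]]                 # role of YOLOv8
    estimate_2d: Callable[[Any, Box], Keypoints2D]         # role of AlphaPose
    lift_3d: Callable[[Keypoints2D], Keypoints3D]          # role of MotionBERT
    reconstruct: Callable[[Any, Box, Keypoints3D], Any]    # role of ExPose

    def run(self, image: Any) -> List[PersonResult]:
        results = []
        for box in self.detect(image):
            kp2d = self.estimate_2d(image, box)
            kp3d = self.lift_3d(kp2d)
            mesh = self.reconstruct(image, box, kp3d)  # assumed inputs
            results.append(PersonResult(box, kp2d, kp3d, mesh))
        return results


def demo() -> List[PersonResult]:
    """Runs the pipeline with trivial stub stages instead of real models."""
    pipeline = Pipeline(
        detect=lambda img: [(0.0, 0.0, 10.0, 20.0)],
        estimate_2d=lambda img, box: [(box[0], box[1]), (box[2], box[3])],
        lift_3d=lambda kps: [(x, y, 0.0) for x, y in kps],
        reconstruct=lambda img, box, kps3d: {"vertices": len(kps3d)},
    )
    return pipeline.run(image=None)
```

Keeping each stage behind a plain callable means a model can be swapped or profiled on its own, which may matter on an edge device where one slow stage limits the whole frame rate.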