Keywords: Monocular 3D Human Pose Estimation; Multi-Task Learning
TL;DR: We propose a novel progressive multi-task learning framework for monocular 3D human pose estimation.
Abstract: The lifting-based framework has dominated the field of monocular 3D human pose estimation by leveraging the well-detected 2D pose as an intermediate representation. However, it neglects the different initial states of the 2D pose and the per-joint depth, encoding the well-detected 2D pose feature and the unknown per-joint depth feature in an entangled feature space. To address this limitation, we present a novel progressive multi-task learning pose estimation framework named PrML. Firstly, PrML introduces two task branches: one refines the well-detected 2D pose feature, while the other learns the per-joint depth feature. This dual-branch design reduces the explicit influence of uncertain depth features on 2D pose features. Secondly, PrML employs a task-aware decoder to exchange complementary information between the refined 2D pose feature and the well-learned per-joint depth feature. This step establishes the connection between 2D pose and per-joint depth, compensating for the lack of interaction caused by the dual-branch design. We also provide a theoretical analysis from the perspective of mutual information and derive a loss that constrains this feature-complementation process. Finally, we use two regression heads to regress the 2D pose and the per-joint depth, respectively, and concatenate them to obtain the final 3D pose. Extensive experiments show that PrML outperforms the conventional lifting-based framework with fewer parameters on two widely used datasets: Human3.6M and MPI-INF-3DHP. Code is available at https://anonymous.4open.science/r/PrML, and we hope our effort can provide a new framework for monocular 3D human pose estimation.
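The abstract's pipeline (dual task branches, a task-aware decoder, and two regression heads whose outputs are concatenated) can be summarized in a minimal PyTorch sketch. All module names, layer counts, dimensions, and the use of bidirectional cross-attention inside the decoder are illustrative assumptions, not the authors' implementation (see the linked code for that).

```python
# Minimal sketch of the PrML pipeline described in the abstract.
# Hypothetical architecture choices throughout; not the official code.
import torch
import torch.nn as nn

class TaskAwareDecoder(nn.Module):
    """Exchanges complementary information between the two task features
    (assumed here to be bidirectional cross-attention with residuals)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.pose_from_depth = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.depth_from_pose = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, f_pose, f_depth):
        p, _ = self.pose_from_depth(f_pose, f_depth, f_depth)
        d, _ = self.depth_from_pose(f_depth, f_pose, f_pose)
        return f_pose + p, f_depth + d

class PrMLSketch(nn.Module):
    def __init__(self, num_joints=17, dim=128):
        super().__init__()
        self.embed = nn.Linear(2, dim)              # lift each 2D joint to a token
        self.pose_branch = nn.TransformerEncoder(   # refines the detected 2D pose feature
            nn.TransformerEncoderLayer(dim, 4, batch_first=True), num_layers=2)
        self.depth_branch = nn.TransformerEncoder(  # learns the per-joint depth feature
            nn.TransformerEncoderLayer(dim, 4, batch_first=True), num_layers=2)
        self.decoder = TaskAwareDecoder(dim)
        self.pose_head = nn.Linear(dim, 2)          # regresses the refined 2D pose (x, y)
        self.depth_head = nn.Linear(dim, 1)         # regresses the per-joint depth (z)

    def forward(self, pose_2d):                     # pose_2d: (B, J, 2)
        tokens = self.embed(pose_2d)
        f_pose = self.pose_branch(tokens)           # dual branches, disentangled features
        f_depth = self.depth_branch(tokens)
        f_pose, f_depth = self.decoder(f_pose, f_depth)
        xy = self.pose_head(f_pose)
        z = self.depth_head(f_depth)
        return torch.cat([xy, z], dim=-1)           # (B, J, 3) final 3D pose

# usage: out = PrMLSketch()(torch.randn(2, 17, 2))  # -> shape (2, 17, 3)
```

The mutual-information loss mentioned in the abstract would be added on top of the standard 2D and depth regression losses; its exact form is given in the paper and is not reproduced here.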
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2366