Semi-supervised Learning for Detector-free Multi-person Pose Estimation

Haixin Wang, Lu Zhou, Yingying Chen, Ming Tang, Jinqiao Wang

Published: 16 Aug 2025, Last Modified: 09 Nov 2025, Machine Intelligence Research, CC BY-SA 4.0
Abstract: Semi-supervised learning is an important approach for learning robust human pose estimation models that perform well on in-the-wild images. Existing semi-supervised methods for human pose estimation mainly focus on instance-agnostic keypoint detection. In multi-person scenes, the arbitrary number of instances makes pose estimation much more challenging, and current semi-supervised methods cannot fully mine the information in unlabeled data. To leverage the instance information in unlabeled data, we propose an end-to-end semi-supervised training strategy. Unlike previous two-stage semi-supervised methods, our method focuses on detector-free frameworks, including bottom-up and single-stage ones. It not only performs consistency regularization on heatmaps but also employs a pseudo-labeling approach to generate instance-specific pseudo annotations. On the COCO and CrowdPose benchmarks, the proposed approach outperforms previous instance-agnostic methods under various labeling ratios. The method applies to both bottom-up and single-stage frameworks, demonstrating its generality.