Estimating Egocentric 3D Human Pose in the Wild with External Weak Supervision

Jian Wang, Lingjie Liu, Weipeng Xu, Kripasindhu Sarkar, Diogo Luvizon, Christian Theobalt

02 Nov 2022, OpenReview Archive Direct Upload, Readers: Everyone

Abstract: Egocentric 3D human pose estimation with a single fisheye camera has drawn a significant amount of attention recently. However, existing methods struggle with pose estimation from in-the-wild images, because they can only be trained on synthetic data due to the unavailability of large-scale in-the-wild egocentric datasets. Furthermore, these methods easily fail when the body parts are occluded by or interacting with the surrounding scene. To address the shortage of in-the-wild data, we collect a large-scale in-the-wild egocentric dataset called Egocentric Poses in the Wild (EgoPW). This dataset is captured by a head-mounted fisheye camera and an auxiliary external camera, which provides an additional observation of the human body from a third-person perspective during training. We present a new egocentric pose estimation method, which can be trained on the new dataset with weak external supervision. Specifically, we first generate pseudo labels for the EgoPW dataset with a spatio-temporal optimization method by incorporating the external-view supervision. The pseudo labels are then used to train an egocentric pose estimation network. To facilitate the network training, we propose a novel learning strategy to supervise the egocentric features with the high-quality features extracted by a pretrained external-view pose estimation model. The experiments show that our method predicts accurate 3D poses from a single in-the-wild egocentric image and outperforms the state-of-the-art methods both quantitatively and qualitatively.
