HmPEAR: A Dataset for Human Pose Estimation and Action Recognition

Published: 20 Jul 2024, Last Modified: 21 Jul 2024 · MM 2024 Poster · CC BY 4.0
Abstract: We introduce HmPEAR, a novel dataset crafted for advancing research in 3D Human Pose Estimation (3D HPE) and Human Action Recognition (HAR), with a primary focus on outdoor environments. The dataset offers a synchronized collection of imagery, LiDAR point clouds, 3D human poses, and action categories. In total, it encompasses over 300,000 frames collected from 10 distinct scenes and 25 diverse subjects. Of these, 250,000 frames carry 3D human pose annotations captured with an advanced motion capture system and further optimized for accuracy. The dataset additionally annotates 40 types of daily human actions, yielding over 6,000 action clips. Through extensive experiments, we demonstrate the quality of HmPEAR and highlight the challenges it poses to current methods. We also propose straightforward baselines that leverage sequential images and point clouds for 3D HPE and HAR, demonstrating that the two tasks mutually reinforce each other and highlighting the potential for cross-task synergies.
Primary Subject Area: [Content] Multimodal Fusion
Relevance To Conference: This work presents a novel multimodal dataset designed for 3D human pose estimation (3D HPE) and human action recognition (HAR). It provides a synchronized collection of imagery, LiDAR point clouds, 3D human poses, and action categories. Additionally, it introduces a multimodal baseline approach that leverages cross-task synergies between 3D HPE and HAR.
Supplementary Material: zip
Submission Number: 2339