HiPoser: 3D Human Pose Estimation with Hierarchical Shared Learning at Parts-Level Using Inertial Measurement Units

Published: 01 Jan 2025, Last Modified: 19 May 2025. Venue: AAAI 2025. License: CC BY-SA 4.0
Abstract: This paper considers the challenging problem of 3D Human Pose Estimation (HPE) from a sparse set of Inertial Measurement Units (IMUs). Existing efforts typically reconstruct a pose sequence either by tackling whole-body motion directly or by focusing on distinctive spatio-temporal features of local body parts. Unfortunately, these methods ignore the interdependent motor synergies among body parts, which can leave the estimated poses of local parts ambiguous. This observation motivates us to propose a hierarchical learning-based approach, HiPoser, which uses a hierarchical shared structure with Mamba blocks as the backbone to address four estimation tasks: 1) torso pose, 2) lower limbs pose, 3) upper limbs pose, and finally 4) global translation. These tasks selectively incorporate body motion states and are carried out sequentially to reconstruct part-based poses, which are then combined to estimate the final full-body pose and a global translation that satisfy inter-part consistency. The hierarchical structure gives HiPoser the flexibility to prioritize different aspects of pose estimation, placing more emphasis on either detail or stability. Empirical evaluations on three benchmark datasets demonstrate the superiority of HiPoser over existing state-of-the-art models, suggesting that analyzing the synergistic movement of body parts is indeed important for advancing IMU-based 3D HPE.
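The sequential, part-level ordering described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Mamba blocks are replaced by random per-frame linear maps, every dimension is a hypothetical placeholder, and feeding each stage all earlier outputs is an assumption (the abstract only says that the tasks "selectively incorporate body motion states"). The names `build_stages` and `estimate_frame` are illustrative only.

```python
import random

# Hypothetical sizes; the abstract does not specify any of these.
IMU_DIM = 72
PART_DIMS = {"torso": 18, "lower_limbs": 24, "upper_limbs": 24, "translation": 3}
# Task order taken from the abstract: torso, lower limbs, upper limbs, translation.
STAGE_ORDER = ["torso", "lower_limbs", "upper_limbs", "translation"]


def make_linear(in_dim, out_dim, rng):
    """Random linear map standing in for a learned sequence block (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, x):
    """Multiply a weight matrix (list of rows) by a feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def build_stages(rng):
    """Create one stage per task; each stage's input grows by earlier outputs."""
    stages = {}
    in_dim = IMU_DIM
    for name in STAGE_ORDER:
        stages[name] = make_linear(in_dim, PART_DIMS[name], rng)
        in_dim += PART_DIMS[name]
    return stages


def estimate_frame(stages, imu_frame):
    """Run the stages in order, appending each estimate to the shared features."""
    features = list(imu_frame)
    outputs = {}
    for name in STAGE_ORDER:
        outputs[name] = apply_linear(stages[name], features)
        features = features + outputs[name]
    return outputs


if __name__ == "__main__":
    rng = random.Random(0)
    stages = build_stages(rng)
    frame = [rng.uniform(-1.0, 1.0) for _ in range(IMU_DIM)]
    result = estimate_frame(stages, frame)
    for name in STAGE_ORDER:
        print(name, len(result[name]))
```

In this sketch the later stages (limbs, translation) are conditioned on the earlier torso estimate, which is one simple way to express inter-part dependence; a real model would operate on sequences and learn its weights.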