Abstract: In dynamic human pose estimation (dynamic HPE), the temporal relationships between human body parts should be captured comprehensively to understand dynamic human motion, since correlated motion information ultimately helps to recognize body parts. Popular methods successfully exploit the long-term motion information captured by low-speed cameras. Yet they neglect the underlying intermediate motions between captured frames, which comprise the temporal-interim poses lost in the video. In this article, we introduce a novel framework, temporal-interim pose synthesis and distillation, to produce and leverage the intermediate motion information for dynamic motion establishment. The pose synthesis yields the visual feature maps of the intermediate poses, which appear between the existing video frames. This allows the synthesized and current poses to form richer motion patterns. Next, the pose distillation divides the body parts into several groups and learns the specific part-wise relationships within each group. This reduces the complexity of learning useful part-wise relationships from rich motion patterns and extracts more detailed motion information for fine-grained part groups. We extensively evaluate our method on challenging datasets for dynamic pose estimation, achieving state-of-the-art results.
External IDs: dblp:journals/tnn/ZhangLWLSBCL25