A Unified Framework for Human-centric Point Cloud Video Understanding

Yiteng Xu, Kecheng Ye, Xiao Han, Yiming Ren, Xinge Zhu, Yuexin Ma

Published: 2024, Last Modified: 13 Nov 2024CVPR 2024EveryoneRevisionsBibTeXCC BY-SA 4.0

Abstract: Human-centric Point Cloud Video Understanding (PVU) is an emerging field focused on extracting and interpreting human-related features from sequences of human point clouds, further advancing downstream human-centric tasks and applications. Previous works usually focus on tack-ling one specific task and rely on huge labeled data, which has poor generalization capability. Considering that hu-man has specific characteristics, including the structural semantics of human body and the dynamics of human motions, we propose a unified framework to make full use of the prior knowledge and explore the inherent features in the data itself for generalized human-centric point cloud video understanding. Extensive experiments demonstrate that our method achieves state-of-the-art performance on various human-related tasks, including action recognition and 3D pose estimation. All datasets and code will be re-leased soon.