Abstract: Compared to motion prediction for isolated individuals, predicting human motion in 3D scenes remains a challenging task, as it requires guidance from both historical motion and the surrounding environment. However, how to effectively introduce scene context into human motion prediction remains underexplored. In this paper, we propose a novel scene-aware motion prediction method that formulates RGBD scene data through multi-view perception and predicts human motion consistent with the scene. First, from the top view, we perform global path planning for the motion trajectory based on scene context. Then, from the 2D view, semantic features of the scene are extracted from image sequences and fused with human motion features to learn the potential interaction between the scene and the motion intention. Finally, a path-guided motion prediction framework is proposed to infer the final human motion in the 3D view. We evaluate the proposed method on two challenging datasets covering both synthetic and real environments. Experimental results demonstrate that our method achieves state-of-the-art motion prediction performance in complex scenes.