Towards Efficient 3D Human Motion Prediction using Deformable Transformer-based Adversarial Network

Published: 01 Jan 2022, Last Modified: 13 Nov 2024. Venue: ICRA 2022. License: CC BY-SA 4.0
Abstract: Human motion prediction is a crucial step toward human-robot interaction. While recent transformer-based methods have shown great potential in 3D human motion prediction, they still suffer from mode collapse to implausible poses and from computational complexity that grows quadratically with the length of the input sequence. In this paper, we propose a novel spatio-temporal deformable transformer-based adversarial network (STDTA) for 3D human motion prediction. First, we design a spatio-temporal deformable transformer module to capture the correlations between human joints while reducing computational cost. Second, we introduce an adversarial training mechanism and design fidelity and continuity discriminators to maintain smoothness and stability in long-term prediction. Finally, extensive experiments on the Human3.6M and AMASS benchmarks demonstrate that the proposed STDTA achieves state-of-the-art performance.
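To make the complexity claim concrete, the following is a minimal sketch, not the authors' implementation, that counts how many query-key pairs are scored per attention layer. It assumes that a deformable attention layer lets each query attend to a small fixed number of sampled positions; the function names and the `num_samples` parameter are hypothetical, and the abstract does not state the actual sampling budget used in STDTA.

```python
def full_attention_pairs(seq_len: int) -> int:
    """Pairs scored by standard self-attention: every token attends to every token."""
    return seq_len * seq_len


def deformable_attention_pairs(seq_len: int, num_samples: int) -> int:
    """Pairs scored when each query attends to a fixed number of sampled positions.

    num_samples is a hypothetical hyperparameter, not a value taken from the paper.
    """
    # A query cannot sample more distinct positions than the sequence contains.
    return seq_len * min(num_samples, seq_len)


# Compare how the two counts grow as the input sequence gets longer.
for seq_len in (10, 50, 100):
    full = full_attention_pairs(seq_len)
    sparse = deformable_attention_pairs(seq_len, num_samples=4)
    print(f"T={seq_len}: full={full}, deformable={sparse}")
```

Under this assumption the cost of full attention grows with the square of the sequence length, while the sampled variant grows linearly for a fixed sampling budget.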