3D Skeleton-Based Human Motion Prediction Using Dynamic Multi-Scale Spatiotemporal Graph Recurrent Neural Networks
Abstract: A dynamic multi-scale spatiotemporal graph recurrent neural network (DMST-GRNN) model is introduced for human motion prediction on 3D skeleton-based human activity datasets. It takes a multi-scale approach to both spatial and temporal graphs, using multi-scale graph convolution units (MGCUs) to capture the semantic interconnections of the human body. The proposed DMST-GRNN is an encoder-decoder
framework in which a series of MGCUs serves as the encoder to learn
spatiotemporal features and a novel graph-gated recurrent unit lite
(GGRU-L) serves as the decoder to predict human poses. Extensive experiments were carried out on two datasets, Human3.6M and
CMU Mocap, considering both short and long sequences
to validate the performance of the proposed model. The DMST-GRNN model outperforms the existing baselines on the Human3.6M
dataset by 11.95% and 7.74% in average mean angle error (avg
MAE) for short- and long-term motion prediction, respectively.
Similarly, on the CMU Mocap dataset, the DMST-GRNN model predicts
future posture more accurately than the previous best approaches,
by 2.77% and 5.51% in avg MAE for
short- and long-term motion prediction, respectively. A comparative
analysis is also presented with other measures such as mean angle
error, prediction loss, and standard deviation. A separate discussion
analyzes the effect of different multi-scale configurations on the
spatial and temporal graphs, along with the impact of the number of
MGCUs.
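The encoder-decoder idea described in the abstract can be sketched conceptually: a graph convolution mixes joint features along the skeleton, and a GRU-style gated update rolls a per-joint hidden state forward through time. This is a minimal illustrative sketch only — the toy chain skeleton, parameter shapes, and the gating equations below are assumptions standing in for the MGCU encoder and GGRU-L decoder, whose exact formulations are given in the paper, not here.

```python
import numpy as np

rng = np.random.default_rng(0)

J, F, H = 5, 3, 8  # joints, input features (e.g. 3D coordinates), hidden size

# Toy skeleton adjacency (a chain of 5 joints), with self-loops, row-normalized
A = np.eye(J)
for i in range(J - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
A = A / A.sum(axis=1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def graph_conv(X, W):
    """One graph convolution: aggregate neighbor features, then project."""
    return np.tanh(A @ X @ W)

# Hypothetical parameters for an encoder projection and a gated recurrent step
We = rng.normal(0, 0.1, (F, H))                              # encoder weights
Wz, Uz = rng.normal(0, 0.1, (H, H)), rng.normal(0, 0.1, (H, H))  # update gate
Wh, Uh = rng.normal(0, 0.1, (H, H)), rng.normal(0, 0.1, (H, H))  # candidate

def gated_step(x_enc, h):
    """GRU-style update per joint; gates also mix information over the graph."""
    z = sigmoid(A @ (x_enc @ Wz + h @ Uz))          # update gate, shape (J, H)
    h_tilde = np.tanh(A @ (x_enc @ Wh + (z * h) @ Uh))  # candidate state
    return (1 - z) * h + z * h_tilde                # blend old and new state

# Encode a short pose sequence, carrying the hidden state forward in time
T = 4
poses = rng.normal(size=(T, J, F))   # T frames of J joints with F features
h = np.zeros((J, H))
for t in range(T):
    h = gated_step(graph_conv(poses[t], We), h)

print(h.shape)  # one hidden vector per joint: (5, 8)
```

In an encoder-decoder setup like the one described, the final hidden state would seed a decoder that autoregressively emits future poses; here only the recurrent graph update itself is shown.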