3D Skeleton-Based Human Motion Prediction Using Dynamic Multi-Scale Spatiotemporal Graph Recurrent Neural Networks
Abstract: A dynamic multi-scale spatiotemporal graph recurrent neural network (DMST-GRNN) model is introduced for human motion prediction on 3D skeleton-based human activity datasets. It takes a multi-scale approach to both spatial and temporal graphs, using multi-scale graph convolution units (MGCUs) to capture the semantic interconnections of the human body. The proposed DMST-GRNN is an encoder-decoder
framework in which a series of MGCUs serves as the encoder to learn
spatiotemporal features and a novel graph-gated recurrent unit lite
(GGRU-L) serves as the decoder to predict human poses. Extensive experiments were carried out on two datasets, Human3.6M and
CMU Mocap, considering both short and long sequences
to validate the performance of the proposed model. The DMST-GRNN model outperforms the existing baselines on the Human3.6M
dataset by 11.95% and 7.74% in average mean angle error (avg
MAE) for short- and long-term motion prediction, respectively.
Similarly, on the CMU Mocap dataset, the DMST-GRNN model predicts
future posture more accurately than the previous best approaches,
by 2.77% and 5.51% in avg MAE for
short- and long-term motion prediction, respectively. A comparative
analysis is also presented with other measures such as mean angle
error, prediction loss, and standard deviation. A separate discussion
analyzes the effect of different multi-scale configurations on the
spatial and temporal graphs, along with the impact of the number of
MGCUs.
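The encoder-decoder idea described in the abstract can be sketched conceptually: a graph convolution mixes joint features along the skeleton, and a GRU-style gated update rolls a per-joint hidden state forward through time. This is a minimal illustrative sketch only — the toy chain skeleton, parameter shapes, and the gating equations below are assumptions standing in for the MGCU encoder and GGRU-L decoder, whose exact formulations are given in the paper, not here.

```python
import numpy as np

rng = np.random.default_rng(0)

J, F, H = 5, 3, 8  # joints, input features (e.g. 3D coordinates), hidden size

# Toy skeleton adjacency (a chain of 5 joints), with self-loops, row-normalized
A = np.eye(J)
for i in range(J - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
A = A / A.sum(axis=1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def graph_conv(X, W):
    """One graph convolution: aggregate neighbor features, then project."""
    return np.tanh(A @ X @ W)

# Hypothetical parameters for an encoder projection and a gated recurrent step
We = rng.normal(0, 0.1, (F, H))                              # encoder weights
Wz, Uz = rng.normal(0, 0.1, (H, H)), rng.normal(0, 0.1, (H, H))  # update gate
Wh, Uh = rng.normal(0, 0.1, (H, H)), rng.normal(0, 0.1, (H, H))  # candidate

def gated_step(x_enc, h):
    """GRU-style update per joint; gates also mix information over the graph."""
    z = sigmoid(A @ (x_enc @ Wz + h @ Uz))          # update gate, shape (J, H)
    h_tilde = np.tanh(A @ (x_enc @ Wh + (z * h) @ Uh))  # candidate state
    return (1 - z) * h + z * h_tilde                # blend old and new state

# Encode a short pose sequence, carrying the hidden state forward in time
T = 4
poses = rng.normal(size=(T, J, F))   # T frames of J joints with F features
h = np.zeros((J, H))
for t in range(T):
    h = gated_step(graph_conv(poses[t], We), h)

print(h.shape)  # one hidden vector per joint: (5, 8)
```

In an encoder-decoder setup like the one described, the final hidden state would seed a decoder that autoregressively emits future poses; here only the recurrent graph update itself is shown.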