Abstract: Skeleton sequences for action recognition exhibit complex temporal dynamics due to factors such as speed variation and differences between activities. Modeling these variations along the temporal dimension is therefore essential. In recent years, skeleton sequences have commonly been modeled as graphs, with Graph Convolutional Networks (GCNs) employed to extract the spatial and temporal features of actions. Although GCNs have achieved strong results, they typically use fixed-size temporal kernels, which ignore the complex temporal dynamics of actions, particularly when both long-term and short-term modeling are needed. To capture these complex motion patterns effectively, we propose a Dual Multi-Scale Graph Convolutional Network (DMS-GCN), which is mainly composed of a Deformable Temporal Kernel (DTK) block and a dual multi-scale strategy. Specifically, the DTK block flexibly captures the complex temporal information of the skeleton sequence, while the dual multi-scale strategy simultaneously accommodates long-term and short-term dynamic information at different scales, both globally and locally. The effectiveness of the proposed method is verified through experiments on two widely used datasets, NTU-RGB+D 60 and NTU-RGB+D 120.
External IDs: dblp:conf/icassp/LiZCWZLL25