Abstract: Graph convolutional networks (GCNs) have been widely used in skeleton-based action recognition and achieve remarkable performance. However, recent GCN-based state-of-the-art (SOTA) models for this task have become increasingly sophisticated and over-parameterized. Their low efficiency in training and inference poses a challenge for practical deployment in real-world scenarios. To address this issue, we construct a lightweight GCN-based model for skeleton-based action recognition, termed LightGCN. We introduce an efficient convolutional neural network (CNN) structure into our temporal convolutional (TC) layer to extract temporal dynamics, which effectively reduces model complexity. Furthermore, we propose a novel attention module that is the first to extend the multi-spectral channel attention mechanism to skeleton-based action recognition. The module preserves not only the lowest-frequency information but also useful information encoded by other frequency components, reducing the information loss during channel compression. To further reduce model complexity, we design a new compound scaling strategy that expands the model's width and depth to different extents, enabling the model to achieve an excellent balance between complexity and accuracy. Evaluated on two large-scale datasets, NTU RGB+D 60 and 120, our proposed LightGCN achieves 92.5% accuracy on the cross-subject benchmark of NTU RGB+D 60, outperforming previous SOTA lightweight models and most heavyweight models, while requiring 24.54% fewer parameters and 24.76% fewer FLOPs than EfficientGCN-B4, the SOTA lightweight model.
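To make the idea of multi-spectral channel attention concrete, the following is a minimal NumPy sketch, not the LightGCN implementation. It assumes features of shape (channels, frames) with the joint dimension already pooled away, a DCT-II basis along the temporal axis, and a two-layer squeeze-and-excitation style gate; the function names, the weights `w1` and `w2`, and the frequency list `freqs` are all hypothetical. The point it illustrates is that global average pooling corresponds, up to a constant factor, to the lowest DCT frequency, so pooling different channel groups with different frequencies retains information that averaging alone discards.

```python
import numpy as np


def dct_basis(length, freq):
    """Unnormalized DCT-II basis vector of frequency `freq` over `length` frames."""
    n = np.arange(length)
    return np.cos(np.pi * freq * (n + 0.5) / length)


def multi_spectral_descriptor(x, freqs):
    """Compress x of shape (C, T) into one scalar per channel.

    Channels are split into len(freqs) groups (an assumption of this sketch),
    and each group is pooled with its own DCT frequency instead of a plain mean.
    """
    num_channels, num_frames = x.shape
    groups = np.array_split(np.arange(num_channels), len(freqs))
    desc = np.empty(num_channels)
    for idx, freq in zip(groups, freqs):
        desc[idx] = x[idx] @ dct_basis(num_frames, freq)
    return desc


def channel_attention(x, w1, w2, freqs):
    """Rescale each channel of x by a gate in (0, 1) computed from its descriptor."""
    desc = multi_spectral_descriptor(x, freqs)
    hidden = np.maximum(w1 @ desc, 0.0)            # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate per channel
    return x * gate[:, None]


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16))            # 8 channels, 16 frames
w1 = rng.standard_normal((2, 8))                   # hypothetical reduction weights
w2 = rng.standard_normal((8, 2))                   # hypothetical expansion weights
attended = channel_attention(features, w1, w2, freqs=[0, 1, 2, 3])
```

With `freqs=[0]` the descriptor reduces to the per-channel sum over frames, which is global average pooling scaled by the number of frames; adding higher frequencies is what distinguishes the multi-spectral variant in this sketch.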