Scale-Aware Graph Convolutional Network With Part-Level Refinement for Skeleton-Based Human Action Recognition
Abstract: Graph Convolutional Networks (GCNs) have been widely used in skeleton-based human action recognition and have achieved promising results. However, current GCN-based methods are limited by their inability to refine semantic-guided joint relations and to perform adaptive multi-scale analysis. These limitations impair their performance, particularly on analogical actions that involve interactions of the same body parts (e.g., drinking water and eating) and on deficient actions with limited spatial-temporal information (e.g., the subtle action of writing and the transient action of sneezing). To solve these problems, we propose Part-level Refined Spatial Graph Convolution (PR-SGC) and Scale-aware Temporal Graph Convolution (Sa-TGC) for optimal action representation. PR-SGC divides the skeleton into body parts and embeds this high-level semantic information to refine the physical adjacency matrix. Sa-TGC leverages a dynamic scale-aware mechanism to extract context-dependent multi-scale features. On this basis, we develop a novel Scale-aware Graph Convolutional Network with Part-level Refinement (SaPR-GCN), which performs on par with state-of-the-art methods on the NTU RGB+D 60, NTU RGB+D 120, and NW-UCLA benchmarks.
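To make the part-refined spatial graph convolution concrete, the following is a minimal NumPy sketch, not the authors' implementation: it assumes a learned part-level offset matrix is added to the physical adjacency before normalization, which is one plausible reading of how PR-SGC "refines the physical adjacency matrix". All function and variable names here are hypothetical.

```python
import numpy as np

def normalize_adjacency(a):
    # Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, standard in GCNs.
    a_hat = a + np.eye(a.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return d_inv_sqrt @ a_hat @ d_inv_sqrt

def part_refined_sgc(x, a_phys, part_offset, w):
    """Hypothetical sketch of a part-refined spatial graph convolution.
    x: (V, C) per-joint features; a_phys: (V, V) skeletal adjacency;
    part_offset: (V, V) learned semantic refinement term (an assumption);
    w: (C, C_out) feature transform.
    """
    a_refined = normalize_adjacency(a_phys + part_offset)
    return np.maximum(a_refined @ x @ w, 0.0)  # ReLU activation

# Toy example: a 3-joint chain with 2-dim features.
rng = np.random.default_rng(0)
a = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
x = rng.standard_normal((3, 2))
offset = 0.1 * np.ones((3, 3))  # stands in for the learned part-level term
out = part_refined_sgc(x, a, offset, rng.standard_normal((2, 4)))
print(out.shape)  # (3, 4)
```

In a full model this layer would be stacked with the scale-aware temporal convolution along the frame axis; the sketch only illustrates the spatial refinement idea on a single frame.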