Abstract: Skeleton-based human action recognition has recently received widespread attention in the computer vision community. However, most existing research focuses on improving recognition accuracy on complete skeleton data and ignores performance on incomplete skeleton data affected by occlusion or noise. This paper addresses occlusion- and noise-robust skeleton-based action recognition and presents a novel Dual Inhibition Training strategy. Specifically, we propose the Part-aware and Dual-inhibition Graph Convolutional Network (PDGCN), which comprises three parts: Input Skeleton Inhibition (ISI), Part-Aware Representation Learning (PARL), and Predicted Score Inhibition (PSI). ISI and PSI are plug-and-play modules that encourage the model to learn discriminative features from diversified body joints by effectively simulating key body part occlusions and random occlusions. The PARL module learns global and local representations from the whole body and from body parts, respectively, and progressively fuses them during representation learning to enhance model robustness under occlusion. Finally, we design different settings for occluded skeleton-based human action recognition to study this problem in depth and to better evaluate different approaches. Our approach achieves state-of-the-art results on different benchmarks and dramatically outperforms recent skeleton-based action recognition approaches, especially under large-scale temporal occlusion.