PAS-Net: Pose-based and Appearance-based Spatiotemporal Networks Fusion for Action Recognition

Changzhen Li, Jie Zhang, Shiguang Shan, Xilin Chen

FG 2020 (modified: 16 Nov 2022)

Abstract: Human poses play an important role in action analysis. However, most state-of-the-art action recognition approaches overlook human poses and rarely leverage pose information to improve recognition performance. In this paper, we propose a novel network architecture that jointly considers appearance information and pose knowledge for robust action recognition. We explore various architectures for fusing appearance and pose information, rather than simply averaging scores at the final layer. Moreover, we propose a novel training strategy to reduce overfitting when training data is limited. Extensive experiments show that our method achieves competitive performance on the popular benchmarks UCF-101 and HMDB-51.
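
To make the contrast in the abstract concrete, the following minimal sketch compares score averaging at the final layer with one possible feature-level fusion. It is an illustration under stated assumptions, not the PAS-Net architecture: the abstract does not specify the fusion designs, so the function names (`late_fusion`, `feature_fusion`), the concatenation strategy, and the toy weights are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(appearance_logits, pose_logits):
    """Baseline mentioned in the abstract: average the two streams' class scores."""
    a = softmax(appearance_logits)
    p = softmax(pose_logits)
    return [(x + y) / 2 for x, y in zip(a, p)]


def feature_fusion(appearance_feat, pose_feat, weights, bias):
    """Hypothetical alternative: concatenate stream features, then classify jointly.

    `weights` has one row per class; each row spans the concatenated feature vector.
    """
    fused = list(appearance_feat) + list(pose_feat)  # feature concatenation
    logits = [
        sum(w * f for w, f in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


if __name__ == "__main__":
    # Toy values, chosen only for illustration.
    print(late_fusion([2.0, 0.5], [0.1, 1.2]))
    print(feature_fusion([1.0], [2.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
```

In the averaging baseline, each stream is classified independently and the two only interact at the score level. In the feature-level variant, a single classifier sees both feature vectors, so it can in principle learn interactions between appearance and pose cues.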
