Exploiting 3D Human Recovery for Action Recognition with Spatio-Temporal Bifurcation Fusion

Published: 01 Jan 2023, Last Modified: 12 Apr 2025, ICASSP 2023, CC BY-SA 4.0
Abstract: Action recognition uses information in images or videos to analyze and classify human behaviors. Existing methods usually exploit 2D pose to improve classification features. Because they lack 3D cues, they cannot distinguish some behaviors that look similar from a 2D perspective. In this paper, we propose a novel action recognition method with 3D human recovery and spatio-temporal bifurcation fusion. It consists of a 3D spatial branch and a 2D temporal branch. The 3D spatial branch exploits overlapping human models from 3D recovery to learn 3D recognition features. The 2D temporal branch uses a channel attention mechanism to enhance dynamically associated features. The two types of features are fused by an adaptive weights module to improve the recognition of behaviors that look similar from a 2D perspective. Extensive experiments show that the proposed method outperforms most state-of-the-art methods on the Olympic Sports, Diving48 and Human3.6M datasets.
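The abstract does not specify how the channel attention mechanism or the adaptive weights module are built, so the following is only a minimal sketch under stated assumptions: channel attention is taken to be a squeeze-and-excitation style sigmoid gate per channel, and the adaptive weights are taken to be two softmax-normalized scalars, one per branch. The names `channel_attention`, `adaptive_fusion`, `gate_weights` and `branch_logits` are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(feature_map, gate_weights):
    """Reweight channels with a sigmoid gate (assumed SE-style design).

    feature_map:  list of channels, each a list of activations.
    gate_weights: one hypothetical learned scalar per channel.
    """
    # Squeeze: global average of each channel.
    pooled = [sum(ch) / len(ch) for ch in feature_map]
    # Excite: a sigmoid gate in (0, 1) for each channel.
    gates = [1.0 / (1.0 + math.exp(-w * p)) for w, p in zip(gate_weights, pooled)]
    # Scale every activation by its channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, feature_map)]


def adaptive_fusion(spatial_feat, temporal_feat, branch_logits):
    """Fuse two equal-length feature vectors with softmax-normalized weights.

    branch_logits: two hypothetical learned scalars, one per branch.
    """
    w_spatial, w_temporal = softmax(branch_logits)
    return [w_spatial * s + w_temporal * t
            for s, t in zip(spatial_feat, temporal_feat)]


# Usage: equal logits give each branch a weight of 0.5.
fused = adaptive_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0])
gated = channel_attention([[1.0, 1.0]], [0.0])
```

In a trained model the gate weights and branch logits would be learned, letting the network lean on the 3D spatial branch when the 2D appearance is ambiguous; here they are fixed constants purely for illustration.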