Frequency-Domain Transformation-Based Dynamic Gesture Recognition with Skeleton

Xiang Liu, Chuankun Li, Shuai Li, Wanqing Li, Danyan Xie

Published: 2024, Last Modified: 13 Mar 2026, Venue: PRCV (3) 2024, License: CC BY-SA 4.0
Abstract: Graph convolutional networks (GCNs) have been widely used in skeleton-based hand gesture recognition due to their strong ability to mine non-Euclidean features. However, GCNs cannot effectively extract long-range temporal information. To address this issue, this paper proposes a frequency-domain auxiliary neural network. Hand gestures are recognized by analyzing temporal features in the frequency domain and combining them with a spatial attention graph convolutional network. The proposed network adopts a two-stream architecture. One stream is a spatial attention graph convolutional network, which uses spatial attention and shift convolution to adaptively exploit the relationships among all hand joints. The other stream is a frequency-domain graph convolutional network, which extracts temporal features from the frequency domain for hand gesture recognition. Score fusion of the two streams is used to improve performance. The effectiveness of the proposed method is validated on the Dynamic Hand Gesture dataset and the First-Person Hand Action dataset, where it achieves state-of-the-art performance.
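As a rough illustration of the two ideas named in the abstract (temporal features taken from the frequency domain, and score fusion of two streams), a minimal sketch follows. It is not the authors' implementation. The abstract does not say which transform or fusion rule is used, so the real FFT along the time axis, the softmax-weighted sum, the weight `alpha`, the function names, and the array shapes are all assumptions made for illustration. Random arrays stand in for the skeleton sequence and for the outputs of the two streams.

```python
import numpy as np

def temporal_spectrum(seq):
    """Magnitude spectrum along the time axis.

    seq: array of shape (T, J, C) = frames, joints, coordinates.
    Assumption: a real FFT is used here; the paper's transform may differ.
    """
    return np.abs(np.fft.rfft(seq, axis=0))

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def fuse_scores(spatial_logits, freq_logits, alpha=0.5):
    """Weighted sum of per-class scores from the two streams.

    alpha is a hypothetical fusion weight, not a value from the paper.
    """
    return alpha * softmax(spatial_logits) + (1.0 - alpha) * softmax(freq_logits)

rng = np.random.default_rng(0)

# Illustrative shapes only: 32 frames, 22 joints, 3-D coordinates.
seq = rng.normal(size=(32, 22, 3))
spec = temporal_spectrum(seq)          # shape (17, 22, 3): 32 // 2 + 1 frequency bins

# Stand-ins for the class logits each stream would produce (14 classes assumed).
spatial_logits = rng.normal(size=14)
freq_logits = rng.normal(size=14)

fused = fuse_scores(spatial_logits, freq_logits)
pred = int(np.argmax(fused))           # predicted class index
```

In a real pipeline, `spec` would feed the frequency-domain stream and the raw `seq` would feed the spatial stream; fusing after the softmax keeps the two streams independent until the final decision.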