Attention-Based Fusion of Directed Rotation Graphs for Skeleton-Based Dynamic Hand Gesture Recognition
Abstract: Recent work on skeleton-based hand gesture recognition has proposed graph-based representations of the hand skeleton. In this paper, we propose a new attention-based directed rotation graph fusion method for skeleton-based hand gesture recognition. First, we present a novel double-stream directed rotation graph feature that jointly captures the spatiotemporal dynamics and the structural information of the hand. We utilize bone direction and rotation information to model the kinematic dependencies and relative geometry of the hand skeleton. The spatial stream employs a spatial directed rotation graph (SDRG), which contains joint position and rotation information to model spatial dependencies between joints. The temporal stream employs a temporal directed rotation graph (TDRG), which contains joint displacement and inter-frame rotation to model temporal dependencies. We then design a new attention-based double-stream fusion framework, ADF-DGNN, in which the two streams are fed into two directed graph neural networks (DGNNs), and the encoded graphs are concatenated and fused by a multi-head attention fusion module to generate expressive and discriminative features for hand gesture recognition. Experiments on the DHG-14/28 dataset demonstrate the effectiveness of each component of the proposed method and its superiority over state-of-the-art methods.
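To make the fusion stage concrete, the following is a minimal sketch of the attention-based fusion module described above, assuming PyTorch and stand-in tensors in place of the two DGNN stream encoders (which the abstract does not specify in detail). All dimensions, layer sizes, and names here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ADFFusion(nn.Module):
    """Sketch of the attention-based double-stream fusion stage.

    spatial_dim / temporal_dim are hypothetical sizes of the per-frame
    embeddings produced by the SDRG and TDRG DGNN encoders; the DGNNs
    themselves are not implemented here and are fed in as inputs.
    """

    def __init__(self, spatial_dim=128, temporal_dim=128,
                 num_heads=4, num_classes=14):
        super().__init__()
        fused_dim = spatial_dim + temporal_dim
        # Multi-head self-attention over the concatenated stream
        # features, as in the fusion module described in the abstract.
        self.attn = nn.MultiheadAttention(fused_dim, num_heads,
                                          batch_first=True)
        self.norm = nn.LayerNorm(fused_dim)
        self.classifier = nn.Linear(fused_dim, num_classes)

    def forward(self, spatial_feat, temporal_feat):
        # spatial_feat:  (batch, frames, spatial_dim)  -- SDRG encoding
        # temporal_feat: (batch, frames, temporal_dim) -- TDRG encoding
        fused = torch.cat([spatial_feat, temporal_feat], dim=-1)
        attn_out, _ = self.attn(fused, fused, fused)
        fused = self.norm(fused + attn_out)  # residual + normalization
        # Temporal average pooling, then gesture classification.
        return self.classifier(fused.mean(dim=1))

# Usage with random stand-ins for the two DGNN stream outputs:
model = ADFFusion(num_classes=14)   # DHG-14 setting (28 for DHG-28)
s = torch.randn(2, 32, 128)         # spatial stream: 2 clips, 32 frames
t = torch.randn(2, 32, 128)         # temporal stream
logits = model(s, t)                # shape: (2, 14)
```

The residual connection and layer normalization around the attention block are common design choices for such fusion modules; whether the paper's fusion module uses them is an assumption here.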