DIDA: Dynamic Individual-to-integrateD Augmentation for Self-supervised Skeleton-Based Action Recognition
Abstract: Self-supervised action recognition plays a crucial role in enabling machines to understand and interpret human actions without the need for large numbers of human-annotated labels. Contrastive learning, which compels the model to focus on discriminative features by constructing positive and negative sample pairs, is a highly effective approach to self-supervised action recognition. Existing contrastive models focus on designing various augmentation methods and simply apply a fixed combination of these augmentations to generate the sample pairs. However, these methods raise two primary concerns: (1) overly strong augmentations can distort the structure of skeleton data and corrupt the semantics of the action; (2) augmentations are applied uniformly, ignoring the distinct characteristics of each augmentation technique. To address these problems, we propose the Dynamic Individual-to-integrateD Augmentation (DIDA) framework, which is designed with a dual-phase structure. In the first phase, a closed-loop feedback mechanism handles each augmentation separately and adjusts its intensity dynamically based on immediate feedback. In the second phase, an individual-to-integrated augmentation strategy with multi-level contrastive learning further enhances the feature-representation ability of the model. Extensive experiments show that DIDA outperforms current state-of-the-art methods on the NTU60 and NTU120 datasets.
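The abstract's first phase describes a closed-loop feedback structure that tunes each augmentation's intensity from immediate training signals. The paper does not give the update rule here, so the following is only a minimal illustrative sketch of the general idea: a per-augmentation intensity that is weakened when the contrastive loss suggests the augmented view is over-distorted, and strengthened otherwise. All names (`ClosedLoopAugmenter`, `jitter`, the threshold `target_loss`) are hypothetical, not from the paper.

```python
import random


def jitter(sequence, intensity):
    """Stand-in skeleton augmentation: add uniform noise scaled by `intensity`
    to each joint coordinate (illustrative, not the paper's augmentation)."""
    return [x + random.uniform(-intensity, intensity) for x in sequence]


class ClosedLoopAugmenter:
    """Illustrative closed-loop controller for one augmentation's intensity.

    The update rule here (compare the contrastive loss against a fixed
    threshold and step the intensity up or down) is an assumption, shown
    only to make the feedback idea concrete.
    """

    def __init__(self, intensity=0.5, step=0.05, lo=0.05, hi=1.0):
        self.intensity = intensity  # current augmentation strength
        self.step = step            # adjustment per feedback signal
        self.lo, self.hi = lo, hi   # clamp range for the intensity

    def update(self, contrastive_loss, target_loss=1.0):
        # Closed-loop feedback: a high loss may indicate the augmented view
        # is semantically distorted, so weaken the augmentation; a low loss
        # suggests the task is too easy, so strengthen it.
        if contrastive_loss > target_loss:
            self.intensity = max(self.lo, self.intensity - self.step)
        else:
            self.intensity = min(self.hi, self.intensity + self.step)
        return self.intensity
```

In a training loop, each augmentation would own its own controller, so intensities evolve independently per augmentation rather than as one fixed, uniform combination.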