MaskCLR: Multi-Level Contrastive Learning for Robust Skeletal Action Recognition

19 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Transformers, Skeleton-based Action Recognition, Contrastive Learning
TL;DR: A novel approach for improving the accuracy, robustness, and generalization of transformer-based skeletal action recognition.
Abstract: Current transformer-based skeletal action recognition models focus on a limited set of joints and low-level motion patterns to predict action classes. This results in significant performance degradation under small skeleton perturbations or when the pose estimator changes between training and testing. In this work, we introduce MaskCLR, a new Masked Contrastive Learning approach for Robust skeletal action recognition. We propose a Targeted Masking (TM) strategy to occlude the most important joints and encourage the model to explore a larger set of discriminative joints. Furthermore, we propose a Multi-Level Contrastive Learning (MLCL) paradigm that enforces the feature embeddings of standard and occluded skeletons to be class-discriminative, i.e., more compact within each class and more dispersed across different classes. Our approach helps the model capture high-level action semantics instead of low-level joint variations, and can be seamlessly incorporated into transformer-based models. Without loss of generality, we apply our method to the Spatial-Temporal Multi-Head Self-Attention encoder (ST-MHSA) and perform extensive experiments on the NTU60, NTU120, and Kinetics400 benchmarks. MaskCLR consistently outperforms previous state-of-the-art methods on standard and perturbed skeletons from different pose estimators, showing improved accuracy, generalization, and robustness to skeleton perturbations. We make our implementation anonymously available at anonymous.4open.science/r/MaskCLR-A503.
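The two components described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, tensor shapes, mask ratio, and the use of per-joint attention scores as the importance signal are all assumptions, and the contrastive term is a generic supervised contrastive loss standing in for the multi-level variant.

```python
import numpy as np

def targeted_mask(joints, importance, mask_ratio=0.3):
    """Sketch of Targeted Masking (TM): zero out the most important joints
    so the model must rely on a larger set of discriminative joints.
    joints:     (B, T, J, C) skeleton sequences
    importance: (B, J) per-joint scores, e.g. from attention maps (assumed)."""
    B, T, J, C = joints.shape
    k = max(1, int(mask_ratio * J))
    topk = np.argsort(-importance, axis=1)[:, :k]   # k most important joints
    keep = np.ones((B, J))
    np.put_along_axis(keep, topk, 0.0, axis=1)      # occlude those joints
    return joints * keep[:, None, :, None]          # broadcast over T and C

def contrastive_loss(z, labels, tau=0.1):
    """Generic supervised contrastive objective: embeddings of the same class
    are pulled together, embeddings of different classes pushed apart."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau
    eye = np.eye(len(z), dtype=bool)
    pos = (labels[:, None] == labels[None, :]) & ~eye
    logits = np.where(eye, -np.inf, sim)            # exclude self-similarity
    m = logits.max(axis=1, keepdims=True)           # stable log-sum-exp
    log_prob = logits - (m + np.log(np.exp(logits - m).sum(1, keepdims=True)))
    per_sample = -np.where(pos, log_prob, 0.0).sum(1) / np.maximum(pos.sum(1), 1)
    return per_sample.mean()
```

In the MLCL paradigm, a contrastive objective of this flavor would be applied at multiple feature levels to the embeddings of both the standard and the masked skeleton streams.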
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1973