Recovering Complete Actions for Cross-dataset Skeleton Action Recognition

Hanchao Liu; Yujiang Li; Tai-Jiang Mu; Shi-min Hu

Published: 25 Sept 2024, Last Modified: 06 Nov 2024, NeurIPS 2024 poster, CC BY 4.0

Keywords: skeleton action recognition, domain generalization, data augmentation

TL;DR: We present a novel recover-and-resample augmentation framework, based on a complete action prior, for the skeleton action generalization task.

Abstract: Despite huge progress in skeleton-based action recognition, its generalizability to different domains remains a challenging issue. In this paper, to address the skeleton action generalization problem, we present a recover-and-resample augmentation framework based on a novel complete action prior. We observe that daily human actions suffer from temporal mismatch across datasets, because the recorded samples are usually partial observations of their complete action sequences. By recovering complete actions and resampling from these full sequences, we can generate strong augmentations for unseen domains. At the same time, we find that large datasets exhibit general action completeness, as indicated by per-frame diversity over time. This allows us to exploit two kinds of transferable knowledge that can be shared across action samples and that help action completion: boundary poses for determining the action start, and linear temporal transforms for capturing global action patterns. We therefore formulate the recovering stage as a two-step stochastic action completion: boundary pose-conditioned extrapolation followed by smooth linear transforms. Both the boundary poses and the linear transforms can be learned efficiently from the whole dataset via clustering. We validate our approach in a cross-dataset setting with three skeleton action datasets, outperforming other domain generalization approaches by a considerable margin.
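The recover-and-resample idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it is a toy version built on stated assumptions. A skeleton clip is assumed to be an array of shape (frames, joints, 3). The boundary pose is assumed to be given (for example, one cluster centre), and the extrapolation is a plain linear blend from that pose to the first observed frame. The linear temporal transform is assumed to be any (T_out, T_in) matrix applied along the time axis, and an interpolation matrix stands in for a learned one. All function names and parameters are hypothetical.

```python
import numpy as np

def extrapolate_from_boundary(seq, boundary_pose, n_pre):
    """Prepend n_pre frames that blend linearly from boundary_pose to seq[0].

    Assumption: linear blending stands in for the paper's extrapolation step.
    """
    # Weights run from 0 (boundary pose) up to, but excluding, 1 (first frame).
    w = np.linspace(0.0, 1.0, n_pre, endpoint=False).reshape(-1, 1, 1)
    prefix = (1.0 - w) * boundary_pose[None] + w * seq[0][None]
    return np.concatenate([prefix, seq], axis=0)

def interpolation_matrix(t_in, t_out):
    """Return a (t_out, t_in) matrix whose rows linearly interpolate frames."""
    pos = np.linspace(0.0, t_in - 1, t_out)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, t_in - 1)
    frac = pos - lo
    W = np.zeros((t_out, t_in))
    W[np.arange(t_out), lo] += 1.0 - frac
    W[np.arange(t_out), hi] += frac
    return W

def apply_temporal_transform(seq, W):
    """Apply a linear transform W of shape (T_out, T_in) along the time axis."""
    return np.einsum("st,tjc->sjc", W, seq)

def resample_window(seq, out_len, rng, min_ratio=0.5):
    """Crop a random temporal window and resample it to out_len frames."""
    t = seq.shape[0]
    win = int(rng.integers(max(2, int(min_ratio * t)), t + 1))
    start = int(rng.integers(0, t - win + 1))
    clip = seq[start:start + win]
    return apply_temporal_transform(clip, interpolation_matrix(win, out_len))

def recover_and_resample(seq, boundary_pose, n_pre, out_len, rng, W=None):
    """Toy pipeline: extrapolate, optionally transform, then resample."""
    full = extrapolate_from_boundary(seq, boundary_pose, n_pre)
    if W is not None:  # hypothetical learned transform, shape (T', len(full))
        full = apply_temporal_transform(full, W)
    return resample_window(full, out_len, rng)

rng = np.random.default_rng(0)
clip = rng.normal(size=(20, 25, 3))   # 20 frames, 25 joints (made-up sizes)
start_pose = np.zeros((25, 3))        # placeholder boundary pose
augmented = recover_and_resample(clip, start_pose, n_pre=5, out_len=64, rng=rng)
```

Calling `recover_and_resample` repeatedly with the same clip gives different augmented clips of a fixed length, because the crop window is drawn at random each time. In the paper's setting the boundary poses and linear transforms are learned from the dataset via clustering; here they are simply passed in as arguments.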

Primary Area: Machine vision

Submission Number: 3759
