Joint-Wise Temporal Self-Similarity Periodic Selection Network for Repetitive Fitness Action Counting
Abstract: Accurate repetitive action counting has crucial applications in the era of AI-assisted universal fitness. Existing methods are prone to large errors in spatially fine-grained action counting scenarios. In this study, we propose a joint-wise temporal self-similarity periodic selection network (JTSPS-Net) that takes human skeleton sequences as input. Periodic knowledge is embedded in skeleton joint units and selected in a coarse-to-fine manner so that the network focuses on temporal repetitions that occur in local regions of the body. JTSPS-Net adopts a temporal multiscale fusion strategy to better handle videos of varying lengths. To preserve the interpretability of the model, we design an impulse map regression module that uses one randomly sampled frame per action unit as its label. Furthermore, to fill the action counting gap in realistic physical fitness scenarios and to scale up existing repetition counting datasets, we construct a high-quality dataset named FitnessRep, which consists of 2,110 fitness videos collected in real-world settings. Experiments demonstrate that JTSPS-Net outperforms state-of-the-art approaches on our dataset and two other public datasets, especially on fine-grained action samples. In addition, it generalizes well to repetitive actions from unseen categories.
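To make the joint-wise temporal self-similarity idea concrete, the sketch below shows one plausible way to compute a per-joint temporal self-similarity map from a skeleton sequence; the function name, tensor shapes, and cosine-similarity choice are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def joint_wise_tsm(skeleton, eps=1e-6):
    """Illustrative sketch (not the paper's exact method).

    skeleton: tensor of shape (T, J, C) -- T frames, J joints,
              C-dimensional coordinates (e.g., C = 3 for x, y, z).
    Returns:  tensor of shape (J, T, T), one T x T cosine-similarity
              map per joint, so a repetition confined to a single
              joint shows up as a periodic pattern in that joint's map.
    """
    T, J, C = skeleton.shape
    per_joint = skeleton.permute(1, 0, 2)               # (J, T, C)
    normed = per_joint / (per_joint.norm(dim=-1, keepdim=True) + eps)
    tsm = torch.bmm(normed, normed.transpose(1, 2))     # (J, T, T)
    return tsm

# Example: a 64-frame sequence with 17 joints in 3-D.
seq = torch.randn(64, 17, 3)
maps = joint_wise_tsm(seq)
print(maps.shape)  # torch.Size([17, 64, 64])
```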