Temporal Boundary Awareness Network for Repetitive Action Counting

Published: 01 Jan 2025, Last Modified: 27 Aug 2025; ACM Trans. Multim. Comput. Commun. Appl. 2025; CC BY-SA 4.0
Abstract: Repetitive Action Counting (RAC) is a critical and challenging task in video analysis that aims to accurately count the repeated actions in a video. Existing methods typically generate a Temporal Self-similarity Matrix (TSM) as an intermediate representation from which the number of repetitions is predicted. While this simplifies the process, it often overlooks the variable lengths of action cycles and the occurrence of motion interruptions. Two main challenges therefore limit the accuracy of RAC in complex scenes: the period inconsistency problem, caused by changes in the action period, and the motion interruption problem, caused by pauses in the motion. To address these challenges, we propose a novel framework. First, we construct a boundary-aware encoder equipped with a temporal pyramid structure that builds multi-scale video features, capturing the period information of repetitive actions of different lengths and thereby addressing the period inconsistency problem. Next, each layer in the pyramid is followed by a cycle and boundary attention module that enhances these multi-scale features with periodic and event boundary information. Finally, we design a gated density estimator that generates an actionness score for each frame, reflecting the probability that the corresponding time point lies within a motion cycle. These scores are used to weight the features, reducing the impact of noise frames that contain no action and addressing the motion interruption problem for better density prediction. Extensive experiments conducted on public datasets demonstrate the effectiveness of our method. The source code will be available at https://github.com/zqzhang2023/TBANRAC.
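The building blocks named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the average-pooling pyramid, the softmax-normalized TSM, and the sigmoid gate are all assumptions chosen for illustration, and the final count is taken as the sum of a per-frame density map, which is a common convention in density-based counting rather than something the abstract specifies.

```python
import numpy as np


def temporal_self_similarity(features):
    """Hypothetical TSM: row-wise softmax over negative squared distances.

    features: array of shape (T, D), one embedding per frame.
    Returns an array of shape (T, T) whose rows each sum to 1.
    """
    sq = np.sum(features ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    dist = np.maximum(dist, 0.0)  # clamp small negatives from rounding
    logits = -dist
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def temporal_pyramid(features, num_levels=3):
    """Assumed pyramid: average-pool adjacent frame pairs to halve T per level."""
    levels = [features]
    for _ in range(num_levels - 1):
        x = levels[-1]
        t = (x.shape[0] // 2) * 2  # drop a trailing odd frame, if any
        if t < 2:
            break
        levels.append(x[:t].reshape(t // 2, 2, x.shape[1]).mean(axis=1))
    return levels


def gate_features(features, actionness_logits):
    """Weight each frame's features by an actionness score in (0, 1).

    The sigmoid gate is an assumption; frames with low scores
    (e.g. pauses) contribute little to later density prediction.
    """
    scores = 1.0 / (1.0 + np.exp(-actionness_logits))
    return features * scores[:, None], scores


def count_from_density(density):
    """Assumed convention: the repetition count is the sum of the density map."""
    return float(np.sum(density))


# Usage with synthetic data (8 frames, 4-dimensional features).
frames = np.arange(32, dtype=float).reshape(8, 4)
pyramid = temporal_pyramid(frames, num_levels=3)
tsm = temporal_self_similarity(frames)
gated, scores = gate_features(frames, np.array([-50.0] * 4 + [50.0] * 4))
```

In this sketch the coarser pyramid levels summarize longer time spans, which is one simple way to expose cycles of different lengths, and the gate suppresses frames judged to lie outside any motion cycle before a density head (not shown) would be applied.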