RASNet: A Reinforcement Assistant Network for Frame Selection in Video-based Posture Recognition

Published: 01 Jan 2023, Last Modified: 17 Apr 2025, Venue: ICME 2023, License: CC BY-SA 4.0
Abstract: Most existing video-based posture recognition methods treat all frames equally, using uniform or random sampling strategies, and thus lose the temporal relationship information among frames. To address this problem, we propose a lightweight framework, RASNet, that adaptively selects informative frames for recognition. Specifically, we design a video-suited exploration environment to guide the agent in learning the selection strategy. We introduce a reparametrization method that converts the discrete action space into a continuous one, making the agent's policy robust and stochastic. For the reward, we design a multi-factor function that rewards the agent for balancing frame usage against accuracy. Extensive experiments on three large-scale datasets demonstrate the effectiveness of RASNet; for example, on Kinetics 600 it achieves 85.9% accuracy while using 1.15 fewer frames than other state-of-the-art methods.
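The abstract names two mechanisms without giving their form: a reparametrization that relaxes the discrete frame choice into a continuous one, and a multi-factor reward that trades frame usage against accuracy. The sketch below illustrates the general idea only. It assumes a Gumbel-softmax relaxation, which is one common reparametrization and may not be the one RASNet uses, and a hypothetical two-factor reward. The names `gumbel_softmax`, `selection_reward`, `temperature`, and `usage_weight` are illustrative and do not come from the paper.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Relax a categorical frame choice into a continuous weight vector.

    Assumption: Gumbel-softmax stands in for the unspecified
    reparametrization mentioned in the abstract.
    """
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
        # Clamp U away from 0 so the inner logarithm is defined.
        u = max(rng.random(), 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def selection_reward(correct, frames_used, frames_total, usage_weight=0.5):
    """Hypothetical reward: an accuracy term minus a frame-usage penalty."""
    accuracy_term = 1.0 if correct else -1.0
    usage_penalty = usage_weight * (frames_used / frames_total)
    return accuracy_term - usage_penalty


# Usage: score four candidate frames, then pick the highest-weighted one.
rng = random.Random(0)
weights = gumbel_softmax([2.0, 0.5, 0.1, 1.0], temperature=0.5, rng=rng)
selected = max(range(len(weights)), key=weights.__getitem__)
```

Lower values of `temperature` push the weights toward a one-hot choice, while higher values spread them out and encourage exploration. In the reward, `usage_weight` sets how strongly the agent is penalized for each additional frame it uses relative to a correct prediction.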