Discriminative Regions Erasing Strategy for Weakly-Supervised Temporal Action Localization

Published: 01 Jan 2020, Last Modified: 19 Apr 2024, Venue: PRCV (2) 2020, Readers: Everyone, License: CC BY-SA 4.0
Abstract: Weakly-supervised temporal action localization (WTAL) has recently attracted attention. Many state-of-the-art methods use a temporal class activation map (T-CAM) to obtain the temporal regions of the target action. However, a class-specific T-CAM tends to cover only the most discriminative part of an action rather than the entire action. In this paper, we propose DRES, an erasing strategy for mining discriminative regions in weakly-supervised temporal action localization. DRES achieves better action localization performance, which can be attributed to two aspects. First, we employ a salient detection module, which suppresses the background to obtain the most discriminative regions. Second, we design an eraser module to discover the action regions missed by the salient detection module, which complements the detected action regions. Experiments demonstrate that DRES improves the state-of-the-art performance on THUMOS’14.
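The erasing idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how regions are erased, how the two modules are combined, or which thresholds are used. The function names, the zero-masking of features, the element-wise maximum used to merge the two T-CAMs, and the threshold values are all assumptions made for illustration, and the second T-CAM is hard-coded where a real model would recompute it from the erased features.

```python
import numpy as np

def erase_discriminative(features, tcam, erase_thresh=0.6):
    """Zero out snippets whose min-max normalized T-CAM score is >= erase_thresh.

    features: (T, D) snippet features; tcam: (T,) scores for one class.
    Zero-masking and the threshold value are assumptions, not from the paper.
    """
    norm = (tcam - tcam.min()) / (tcam.max() - tcam.min() + 1e-8)
    mask = norm >= erase_thresh
    erased = features.copy()
    erased[mask] = 0.0
    return erased, mask

def segments_from_scores(scores, loc_thresh=0.5):
    """Return [start, end) index pairs of consecutive snippets with score >= loc_thresh."""
    active = scores >= loc_thresh
    padded = np.concatenate(([False], active, [False]))
    # Indices where the active flag flips: even positions are starts, odd are ends.
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(changes[::2], changes[1::2])]

# Toy example: 5 snippets, 4-dimensional features (hypothetical values).
features = np.ones((5, 4))
tcam_first = np.array([0.1, 0.2, 0.9, 0.3, 0.1])   # peaks on the most discriminative snippet
erased_features, erased_mask = erase_discriminative(features, tcam_first)

# Stand-in for the T-CAM a second branch would produce from erased_features.
tcam_second = np.array([0.1, 0.7, 0.0, 0.8, 0.1])

# Assumed merge rule: element-wise maximum of the two activation maps.
tcam_merged = np.maximum(tcam_first, tcam_second)

print(segments_from_scores(tcam_first))    # segment from the first T-CAM alone
print(segments_from_scores(tcam_merged))   # wider segment after merging
```

In this toy case the first T-CAM alone localizes only the single peak snippet, while the merged map also covers its neighbors, which is the kind of complementary coverage the eraser module is described as providing.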