Dynamic Erasing Network With Adaptive Temporal Modeling for Weakly Supervised Video Anomaly Detection

Chen Zhang, Guorong Li, Yuankai Qi, Hanhua Ye, Laiyun Qing, Ming-Hsuan Yang, Qingming Huang

Published: 01 Jan 2025, Last Modified: 09 Nov 2025IEEE Transactions on Neural Networks and Learning SystemsEveryoneRevisionsCC BY-SA 4.0
Abstract: The weakly supervised video anomaly detection aims to learn a detection model using only video-level labeled data. Prior studies ignore the complexity or duration of anomalies present in abnormal videos during temporal modeling. Moreover, existing works usually detect the most abnormal segments, potentially overlooking the completeness of anomalies. We propose a dynamic erasing network (DE-Net) for weakly supervised video anomaly detection, which learns video-specific temporal features via adaptive temporal modeling (ATM) to address these limitations. Specifically, to handle duration variations of abnormal events, we propose an ATM module capable of adaptively selecting and aggregating the most appropriate K temporal scale features for each video. Then, we design a dynamic erasing (DE) strategy that dynamically assesses the completeness of the detected anomalies and erases prominent abnormal segments to encourage the model to discover gentle abnormal segments. The proposed method achieves favorable performance compared to several state-of-the-art approaches on the widely used XD-Violence, TAD, and UCF-Crime datasets.
Loading