TCA-VAD: Temporal Context Alignment Network for Weakly Supervised Video Anomaly Detection

Published: 01 Jan 2022 · Last Modified: 13 Nov 2024 · ICME 2022 · CC BY-SA 4.0
Abstract: Weakly supervised video anomaly detection (VAD) is usually formulated as a multiple instance learning (MIL) problem. Although current MIL-based methods have achieved promising detection performance, the temporal dependencies in videos are not well exploited. A given anomaly video may contain multiple abnormal clips, while previous work focused only on the most abnormal one. To address these issues, a temporal context alignment (TCA) network for video anomaly detection is proposed in this work. Its merits are three-fold: 1) a sparse continuous sampling strategy is proposed to adapt to the varying lengths of untrimmed videos; 2) a multi-scale attention module is used to establish temporal dependencies within a video; 3) a top-k loss strategy is used to enlarge the distance between the top-k normal and abnormal clips. Extensive experiments demonstrate the noticeable anomaly discriminability of the proposed network on two public datasets (ShanghaiTech and UCF-Crime).
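The top-k loss strategy described above can be sketched as a hinge-style MIL ranking loss: instead of comparing only the single most abnormal clip per video, it compares the mean of the k highest-scoring clips from an abnormal video against those from a normal video. This is a minimal illustrative sketch, not the paper's exact formulation; the function names, `margin` parameter, and use of a simple mean over the top-k scores are assumptions.

```python
def topk_mean(scores, k):
    """Mean of the k largest clip anomaly scores in one video."""
    return sum(sorted(scores, reverse=True)[:k]) / k

def topk_ranking_loss(abn_scores, norm_scores, k=3, margin=1.0):
    """Hinge loss pushing the top-k abnormal clip scores above the
    top-k normal clip scores by at least `margin` (illustrative sketch)."""
    return max(0.0, margin - topk_mean(abn_scores, k) + topk_mean(norm_scores, k))

# Clip-level anomaly scores for one abnormal and one normal video (made up).
abnormal = [0.9, 0.8, 0.2, 0.9]
normal = [0.1, 0.2, 0.1, 0.05]
loss = topk_ranking_loss(abnormal, normal, k=2)
```

Compared with a plain max-score MIL loss, averaging over the top-k clips lets the gradient reach several abnormal segments in one video rather than just the single highest-scoring one.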