Learning a Condensed Frame for Memory-Efficient Video Class-Incremental Learning

Yixuan Pei; Zhiwu Qing; Jun CEN; Xiang Wang; Shiwei Zhang; Yaxiong Wang; Mingqian Tang; Nong Sang; Xueming Qian

Learning a Condensed Frame for Memory-Efficient Video Class-Incremental Learning

Yixuan Pei, Zhiwu Qing, Jun CEN, Xiang Wang, Shiwei Zhang, Yaxiong Wang, Mingqian Tang, Nong Sang, Xueming Qian

Published: 31 Oct 2022, Last Modified: 11 Jan 2023NeurIPS 2022 AcceptReaders: Everyone

Abstract: Recent incremental learning for action recognition usually stores representative videos to mitigate catastrophic forgetting. However, only a few bulky videos can be stored due to the limited memory. To address this problem, we propose FrameMaker, a memory-efficient video class-incremental learning approach that learns to produce a condensed frame for each selected video. Specifically, FrameMaker is mainly composed of two crucial components: Frame Condensing and Instance-Specific Prompt. The former is to reduce the memory cost by preserving only one condensed frame instead of the whole video, while the latter aims to compensate the lost spatio-temporal details in the Frame Condensing stage. By this means, FrameMaker enables a remarkable reduction in memory but keep enough information that can be applied to following incremental tasks. Experimental results on multiple challenging benchmarks, i.e., HMDB51, UCF101 and Something-Something V2, demonstrate that FrameMaker can achieve better performance to recent advanced methods while consuming only 20% memory. Additionally, under the same memory consumption conditions, FrameMaker significantly outperforms existing state-of-the-arts by a convincing margin.

Supplementary Material: pdf

21 Replies

Loading