Time-Class Cross-Attention Classifier for Exemplar-Free Continual Learning in Video Action Recognition

Published: 30 Jan 2024 · Last Modified: 17 Nov 2024 · OpenReview Archive Direct Upload · CC BY-SA 4.0
Abstract: In the video domain, continual learning has traditionally relied on data storage to prevent forgetting the knowledge learned in previous tasks. However, because videos are substantially larger than images, storing them and selecting important frames incurs significant storage and time costs. To this end, we explore methods for maintaining prior information without storing or reusing data, proposing a Time-Class Cross-Attention Classifier for continual learning in video action recognition. We employ learnable class queries to compress class knowledge, and a cross-attention classifier architecture to capture the relationship between class queries and the temporal information in videos. When learning new tasks, we transfer information from the previous cross-attention classifier to preserve the temporal cues needed to classify previous classes. Experimental results show that the proposed model significantly improves performance regardless of whether data reuse is feasible, offering a novel perspective on continual learning for action recognition. Our code will be made available.
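The core mechanism described above can be illustrated with a minimal sketch: each class owns a learnable query vector that cross-attends over per-frame features, so the logit for a class depends on which time steps that class's query attends to. This is an illustrative NumPy approximation under assumed shapes and a simple dot-product scoring head, not the authors' exact implementation; all names (`cross_attention_logits`, `frame_feats`, `class_queries`) and dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_logits(frame_feats, class_queries):
    """Score each class by cross-attending its query over frame features.

    frame_feats:   (T, d) temporal features, one row per video frame
    class_queries: (C, d) one learnable query per class
    Returns (C,) logits and the (C, T) attention map.
    """
    d = class_queries.shape[1]
    # Each class query attends over the T frames (scaled dot-product).
    attn = softmax(class_queries @ frame_feats.T / np.sqrt(d))   # (C, T)
    # Class-specific temporal pooling of the frame features.
    pooled = attn @ frame_feats                                  # (C, d)
    # Simple scoring head: similarity between pooled feature and query.
    logits = (pooled * class_queries).sum(axis=1)                # (C,)
    return logits, attn

# Toy usage with random features: 8 frames, 16-dim features, 5 classes.
rng = np.random.default_rng(0)
frames = rng.standard_normal((8, 16))
queries = rng.standard_normal((5, 16))
logits, attn = cross_attention_logits(frames, queries)
```

Because the attention map is per class, preserving the previous classifier's queries and attention behaviour (e.g. via distillation when new tasks arrive) retains the temporal cues old classes relied on, without replaying stored video data.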