Inter-Batch Cross-Attention: See More to Forget Less

ACL ARR 2024 June Submission 4245 Authors

16 Jun 2024 (modified: 04 Aug 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Our paper presents a simple training strategy, named Inter-Batch Cross-Attention (IBCA), that helps prevent catastrophic forgetting in continual learners. We find that adding an IBCA module at the input level can significantly improve a model's continual learning performance with minimal memory and computational overhead. Our method requires only minimal changes to existing transformer-based architectures and can be used alongside other continual learning strategies. We demonstrate its effectiveness on class-incremental classification tasks on the 20 Newsgroups dataset.
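The abstract does not specify how the IBCA module is wired into the model, so the following is only a minimal sketch of one plausible reading: cache input-level representations from the previous batch and let the current batch cross-attend to them before the transformer encoder. The module name, the cache buffer, and all hyperparameters below are hypothetical, not taken from the paper.

```python
# Hypothetical sketch of an inter-batch cross-attention (IBCA) style module.
# Assumption: "inter-batch" means the current batch attends to representations
# cached from the previous batch at the input (embedding) level.
import torch
import torch.nn as nn


class IBCAModule(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        # Non-persistent buffer holding flattened token representations
        # from the previous batch (empty on the first step).
        self.register_buffer(
            "prev_batch_cache", torch.zeros(0, d_model), persistent=False
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) input embeddings of the current batch
        if self.prev_batch_cache.numel() > 0:
            # Cross-attend from current tokens to the cached previous-batch tokens.
            mem = self.prev_batch_cache.unsqueeze(0).expand(x.size(0), -1, -1)
            attn_out, _ = self.cross_attn(query=x, key=mem, value=mem)
            x = self.norm(x + attn_out)
        # Cache a detached copy of this batch's tokens for the next batch.
        self.prev_batch_cache = x.detach().reshape(-1, x.size(-1))
        return x
```

In this reading, the module would sit between the embedding layer and the first encoder block, which keeps the base transformer unchanged; whether the paper caches raw embeddings, hidden states, or something else is not stated in the abstract.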
Paper Type: Short
Research Area: Machine Learning for NLP
Research Area Keywords: continual learning
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 4245