Inter-Batch Cross-Attention: See More to Forget Less

ACL ARR 2024 June Submission 4245 Authors

16 Jun 2024 (modified: 04 Aug 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Our paper presents a simple training strategy, named Inter-Batch Cross-Attention (IBCA), that helps prevent catastrophic forgetting in continual learners. We find that adding an IBCA module at the input level can significantly improve a model's continual learning performance with minimal memory and computational overhead. Our method requires only minimal changes to existing transformer-based architectures and can be used alongside other continual learning strategies. We demonstrate its effectiveness on class-incremental classification tasks on the 20 Newsgroups dataset.
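The abstract does not specify how the IBCA module is wired into the model, so the following is only a minimal sketch of one plausible reading: cache input-level representations from the previous batch and let the current batch cross-attend to them before the transformer encoder. The module name, the cache buffer, and all hyperparameters below are hypothetical, not taken from the paper.

```python
# Hypothetical sketch of an inter-batch cross-attention (IBCA) style module.
# Assumption: "inter-batch" means the current batch attends to representations
# cached from the previous batch at the input (embedding) level.
import torch
import torch.nn as nn


class IBCAModule(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        # Non-persistent buffer holding flattened token representations
        # from the previous batch (empty on the first step).
        self.register_buffer(
            "prev_batch_cache", torch.zeros(0, d_model), persistent=False
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) input embeddings of the current batch
        if self.prev_batch_cache.numel() > 0:
            # Cross-attend from current tokens to the cached previous-batch tokens.
            mem = self.prev_batch_cache.unsqueeze(0).expand(x.size(0), -1, -1)
            attn_out, _ = self.cross_attn(query=x, key=mem, value=mem)
            x = self.norm(x + attn_out)
        # Cache a detached copy of this batch's tokens for the next batch.
        self.prev_batch_cache = x.detach().reshape(-1, x.size(-1))
        return x
```

In this reading, the module would sit between the embedding layer and the first encoder block, which keeps the base transformer unchanged; whether the paper caches raw embeddings, hidden states, or something else is not stated in the abstract.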
Paper Type: Short
Research Area: Machine Learning for NLP
Research Area Keywords: continual learning
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 4245