Mitigating Interference in the Knowledge Continuum through Attention-Guided Incremental Learning

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Continual Learning, Catastrophic Forgetting, Experience Rehearsal, Class Incremental Learning, Task Incremental Learning, Lifelong Learning, Task Attention
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A novel rehearsal-based continual learning approach that utilizes lightweight, learnable task projection vectors to reduce interference between tasks and improve class-incremental learning performance
Abstract: Continual learning (CL) remains a significant challenge for deep neural networks, as they are prone to forgetting previously acquired knowledge. Several approaches have been proposed in the literature, such as experience rehearsal, regularization, and parameter isolation, to address this problem. Although almost zero forgetting can be achieved in task-incremental learning, class-incremental learning remains highly challenging due to the problem of inter-task class separation: limited access to previous task data makes it difficult to discriminate between classes of current and previous tasks. To address this issue, we propose 'Attention-Guided Incremental Learning' (AGILE), a novel rehearsal-based CL approach that incorporates compact task-attention to effectively reduce interference between tasks. AGILE utilizes lightweight, learnable task projection vectors to transform the latent representations of a shared task-attention module toward the task distribution. Through extensive empirical evaluation, we show that AGILE significantly improves generalization performance by mitigating task interference and outperforms rehearsal-based approaches in several CL scenarios. Furthermore, AGILE scales well to a large number of tasks with minimal overhead while remaining well-calibrated with reduced task-recency bias.
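To make the abstract's mechanism concrete, below is a minimal, hypothetical PyTorch sketch of per-task projection vectors steering a shared attention module. The class name `TaskAttention`, the additive query modulation, and all dimensions are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class TaskAttention(nn.Module):
    """Shared attention module modulated by lightweight per-task vectors.

    Hypothetical sketch: one learnable projection vector per task shifts
    the queries of a shared attention block toward that task's distribution.
    """

    def __init__(self, feat_dim: int = 128, num_tasks: int = 10, num_heads: int = 4):
        super().__init__()
        # One small learnable vector per task -- O(feat_dim) parameters each.
        self.task_vectors = nn.Parameter(0.01 * torch.randn(num_tasks, feat_dim))
        # The attention weights themselves are shared across all tasks.
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)

    def forward(self, feats: torch.Tensor, task_id: int) -> torch.Tensor:
        # feats: (batch, seq_len, feat_dim) latent features from a backbone.
        v = self.task_vectors[task_id]           # (feat_dim,), broadcasts below
        query = feats + v                        # steer queries toward the task
        out, _ = self.attn(query, feats, feats)  # shared attention, task-shifted query
        return out

# Usage: attend to the same features under two different task projections.
module = TaskAttention()
x = torch.randn(8, 16, 128)
out_task0 = module(x, task_id=0)
out_task3 = module(x, task_id=3)
```

Under these assumptions, only `task_vectors` grows with the number of tasks, which would be consistent with the abstract's claim that AGILE scales to a large number of tasks with minimal overhead.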
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7288