Keywords: continual learning, class-incremental learning
TL;DR: We apply symmetric distillation to separate learning new representations from merging existing ones, successfully mitigating task bias and the stability gap.
Abstract: Continual learning strives to train a model sequentially, learning from new tasks while retaining information about old tasks. Treating this as a standard classification problem leads to catastrophic forgetting, especially in deep learning settings, where knowledge of old tasks is forgotten as soon as the model is optimized on new tasks. Existing solutions tackle this problem by imposing strict assumptions, such as the availability of exemplars from previously seen classes or a warm start of the model on many classes before continual learning begins. While effective on known benchmarks, such assumptions can be impractical and do not directly address the stability-plasticity dilemma in continual learning. In this paper, we follow a recent push in the field to tackle continual learning in the exemplar-free cold-start setting. We propose Model-in-the-Middle (MITM). The idea behind MITM is to separate the learning of new classes and the retention of past class knowledge by using two distinct models. We propose a learner with symmetric distillation from both models, enabling us to learn evolving representations as new tasks arrive. We show that explicitly separating and balancing old and new tasks through symmetric distillation helps absorb large distribution shifts between tasks, mitigating the stability gap. Our approach is simple yet outperforms the state-of-the-art in the challenging exemplar-free cold-start continual learning setting.
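To make the idea of symmetric distillation concrete, below is a minimal sketch of one plausible instantiation: a "middle" learner whose predictions are distilled, with equal weight, toward a frozen old-task model and a plastic new-task model. The abstract does not specify the exact loss, model names, or weighting, so the function, temperature, and 0.5/0.5 balance here are our assumptions for illustration, not the paper's formulation.

```python
# Hypothetical sketch of a symmetric distillation objective (assumed form,
# not the authors' exact loss): the middle model is pulled toward both a
# frozen old-task teacher and a new-task teacher with equal weight.
import torch
import torch.nn.functional as F


def symmetric_distillation_loss(middle_logits, old_logits, new_logits, T=2.0):
    """Equally weighted KL distillation of the middle model toward both teachers."""
    log_p = F.log_softmax(middle_logits / T, dim=1)       # student (middle model)
    q_old = F.softmax(old_logits / T, dim=1)              # frozen past-task teacher
    q_new = F.softmax(new_logits / T, dim=1)              # current-task teacher
    kl_old = F.kl_div(log_p, q_old, reduction="batchmean") * T * T
    kl_new = F.kl_div(log_p, q_new, reduction="batchmean") * T * T
    return 0.5 * (kl_old + kl_new)


if __name__ == "__main__":
    # Toy usage with random logits; batch size and class count are illustrative.
    B, C = 8, 10
    middle = torch.randn(B, C, requires_grad=True)
    old = torch.randn(B, C)   # stands in for the frozen old-task model's outputs
    new = torch.randn(B, C)   # stands in for the new-task model's outputs
    loss = symmetric_distillation_loss(middle, old, new)
    loss.backward()
    print(float(loss))
```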
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6544