Keywords: continual learning, memory, catastrophic forgetting, generalization
Abstract: Continual learning (CL) is a paradigm in which a model adapts to and retains knowledge from a stream of tasks. Despite the growing number of experimental methods in CL, rigorous theoretical analysis is still lacking, particularly for memory-based continual learning (MCL), which remains an open research area. In this paper, we theoretically analyze the impact of memory in CL and derive explicit expressions for the expected forgetting and generalization errors under overparameterized linear models. We propose a detailed matrix decomposition of the data to distinguish between the current and previous datasets, effectively decoupling the data coupled across tasks. We further provide a complete mathematical analysis for scenarios with a small number of tasks and a numerical analysis for larger numbers of tasks to characterize the overall behavior of the expected forgetting and generalization errors. Compared with CL without memory, our theoretical analysis suggests that (1) a larger memory buffer must be paired with a larger model to effectively reduce forgetting, and (2) training with a larger memory buffer generalizes better when tasks are similar but may perform worse when tasks are dissimilar, while training with a larger model can help mitigate this negative effect. Ultimately, our findings shed new light on how memory can help CL mitigate catastrophic forgetting and improve generalization.
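To make the setting concrete, below is a minimal sketch (not the paper's derivation or method) of memory-based continual learning with an overparameterized linear model: each task is linear regression with far more parameters than samples, a small buffer of earlier data is replayed alongside the current task, and forgetting is measured as the increase in test error on the first task. The dimensions `p, n, m`, the `similarity` knob for task relatedness, and all helper names are illustrative assumptions, not quantities from the paper.

```python
# Minimal sketch (illustrative assumptions only): memory-based continual
# learning with an overparameterized linear model, echoing the setting in
# the abstract. Task/buffer sizes and the similarity parameter are made up.
import numpy as np

rng = np.random.default_rng(0)
p, n, m = 200, 20, 10  # model dim p >> samples n per task; m = memory buffer size

def make_task(similarity, base_w):
    # Ground-truth weights interpolate between a shared vector and fresh
    # noise, so `similarity` controls how related the tasks are.
    w = similarity * base_w + (1 - similarity) * rng.standard_normal(p)
    X = rng.standard_normal((n, p))
    return X, X @ w, w

def min_norm_fit(X, y):
    # Minimum-norm least-squares solution, which gradient descent from zero
    # initialization converges to in the overparameterized linear regime.
    return np.linalg.pinv(X) @ y

base_w = rng.standard_normal(p)
X0, y0, w0 = make_task(0.9, base_w)   # task 0
X1, y1, _ = make_task(0.9, base_w)    # task 1

w_task0 = min_norm_fit(X0, y0)

# Continual step: fit task-1 data jointly with an m-sample buffer from task 0.
X_mem, y_mem = X0[:m], y0[:m]
w_cl = min_norm_fit(np.vstack([X1, X_mem]), np.concatenate([y1, y_mem]))

def test_err(w_hat, w_true, n_test=1000):
    # Expected squared prediction error on fresh Gaussian inputs.
    Xt = rng.standard_normal((n_test, p))
    return np.mean((Xt @ w_hat - Xt @ w_true) ** 2)

# Forgetting on task 0: error after the continual step minus error right
# after training on task 0 alone.
forgetting = test_err(w_cl, w0) - test_err(w_task0, w0)
print(f"task-0 forgetting with buffer m={m}: {forgetting:.3f}")
```

Varying `m`, `p`, and `similarity` in this toy setup is one way to probe the trends the abstract describes, such as whether a larger buffer helps more when paired with a larger model.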
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13727