Coreset Selection via Reducible Loss in Continual Learning

Published: 22 Jan 2025, Last Modified: 02 Mar 2025 · ICLR 2025 Poster · CC BY 4.0
Keywords: Continual learning, Coreset selection
Abstract:

Rehearsal-based continual learning (CL) aims to mitigate catastrophic forgetting by maintaining a subset of samples from previous tasks and replaying them. The rehearsal memory can naturally be constructed as a coreset, a compact subset that enables training with performance comparable to using the full dataset. Coreset selection can be formulated as a bilevel optimization problem that solves for the subset minimizing the outer objective of the learning task. Existing methods primarily rely on inefficient probabilistic sampling or local gradient-based scoring to approximate sample importance through an iterative process that is susceptible to ambiguity and noise. Specifically, non-representative samples, such as ambiguous or noisy ones, are difficult to learn and incur high loss values even when the model is trained on the full dataset. Yet methods that rely on local gradients tend to highlight these samples in an attempt to minimize the outer loss, leading to a suboptimal coreset. To enhance coreset selection, especially in CL where high-quality samples are essential, we propose a coreset selection method that measures sample importance using reducible loss (ReL), which quantifies the impact of adding a sample on model performance. By leveraging ReL and a procedure derived from bilevel optimization, we identify and retain the samples that yield the highest performance gain; these samples are shown to be informative and representative. Furthermore, ReL requires only forward computation, making it significantly more efficient than previous methods. To better apply coreset selection in CL, we extend our method to address key challenges such as task interference, streaming data, and knowledge distillation. Experiments on data summarization and continual learning demonstrate the effectiveness and efficiency of our approach.
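To illustrate the forward-only scoring idea described above, the following is a minimal, hypothetical sketch. It assumes ReL is computed as the gap between a sample's loss under a reference model that has not benefited from the candidate data and its loss under a model trained on the full data, with the highest-scoring samples kept up to the memory budget; the model names, the exact ReL definition, and the full selection procedure are assumptions for illustration and follow the paper, not this sketch.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()  # only forward passes are needed to score samples
def rel_scores(model_full, model_reference, data_loader, device="cpu"):
    """Score each sample by an assumed form of reducible loss (ReL):
    the per-sample loss under a reference model minus the loss under
    a model trained on the full data. A large gap suggests the sample
    contributes a large performance gain when included."""
    model_full.eval()
    model_reference.eval()
    scores = []
    for x, y in data_loader:
        x, y = x.to(device), y.to(device)
        loss_ref = F.cross_entropy(model_reference(x), y, reduction="none")
        loss_full = F.cross_entropy(model_full(x), y, reduction="none")
        scores.append(loss_ref - loss_full)
    return torch.cat(scores)

def select_coreset(scores, budget):
    """Keep the indices of the samples with the highest ReL scores,
    up to the rehearsal memory budget."""
    return torch.topk(scores, k=budget).indices
```

Because scoring involves no backward passes or per-sample gradients, this style of selection avoids the cost of local gradient-based scoring; the treatment of task interference, streaming data, and knowledge distillation in CL is beyond this sketch.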

Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1231