Abstract: Continual Learning (CL) methods usually learn from all the available data. However, human cognition differs: it efficiently focuses on key experiences while disregarding redundant information. Similarly, not all data points in a dataset have equal potential; some can be more informative than others. In CL especially, redundant or low-quality data can be detrimental to learning efficiency and can exacerbate catastrophic forgetting. Drawing inspiration from this, we explore the potential of learning from important samples and present an empirical study evaluating coreset selection techniques in the context of CL to stimulate research in this unexplored area. We train different continual learners on increasing amounts of selected samples and analyze the learning-forgetting dynamics, shedding light on the underlying mechanisms driving their improved stability-plasticity balance. We present several significant observations: learning from selectively chosen samples (i) enhances incremental accuracy, (ii) improves knowledge retention of previous tasks, and (iii) continually refines learned representations. This analysis contributes to a deeper understanding of selective learning strategies in CL scenarios. The code is available at https://anonymous.4open.science/r/Data-Diet-CD87.
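To make the coreset-selection idea concrete, below is a minimal sketch (not the paper's implementation) of one common selection heuristic: EL2N scores, i.e. the L2 norm of the error between the model's softmax output and the one-hot label. Higher-scoring samples are treated as more informative, and a continual learner could train on, or store in its replay buffer, only the top fraction of each task's data. The model, data loader, and the 50% keep-ratio below are illustrative assumptions, not details taken from the submission.

```python
# Hedged sketch of coreset selection via per-sample EL2N importance scores.
# Assumes a trained (or partially trained) classifier and a DataLoader over
# the current task's data; these are placeholders, not the paper's setup.
import torch
import torch.nn.functional as F


def el2n_scores(model, loader, num_classes, device="cpu"):
    """Return one importance score per sample, in loader order."""
    model.eval()
    scores = []
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            probs = F.softmax(model(x), dim=1)
            one_hot = F.one_hot(y, num_classes).float()
            # EL2N: L2 distance between predicted distribution and label.
            scores.append((probs - one_hot).norm(dim=1))
    return torch.cat(scores)


def select_coreset(scores, keep_ratio=0.5):
    """Indices of the highest-scoring samples (the selected coreset)."""
    k = max(1, int(keep_ratio * scores.numel()))
    return torch.topk(scores, k).indices
```

A `torch.utils.data.Subset` built from these indices could then serve as the task's training set or as the pool from which a replay buffer is filled; varying `keep_ratio` would correspond to training on increasing amounts of selected samples, as studied in the submission.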
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Cedric_Archambeau1
Submission Number: 3604