Spurious Features in Continual Learning

TMLR Paper 780 Authors

16 Jan 2023 (modified: 17 Sept 2024) · Rejected by TMLR · CC BY 4.0
Abstract: Continual Learning (CL) is a field of research that addresses training scenarios where the data distribution changes over time. One of its key challenges is learning without forgetting. To achieve this, CL algorithms need to learn stable and robust representations that generalize to new data. However, since data is acquired gradually, the learned representations may be biased and require validation against more data. This paper investigates the impact of spurious features on CL algorithms. We show that these algorithms can learn to rely on features that do not generalize, leading to poor performance in both memorization and generalization. We identify two related problems: (1) spurious features (SF), which result from a shift in the data distribution between training and testing, and (2) local spurious features (LSF), which arise from the limited access to data at each training step. To study the impact of (1), we conduct a series of experiments that vary the amount of spurious correlation in the data distribution. We also propose an experimental setup to estimate the influence of (2) in standard continual learning scenarios. Our results show that both (1) and (2) can cause models to overfit and degrade CL performance, in addition to catastrophic forgetting (CF). By highlighting the influence of (local) spurious features in CL algorithms, this paper offers a novel perspective on performance degradation in continual learning.
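For concreteness, one common way to instantiate "varying the amount of spurious correlation" is a Colored-MNIST-style construction, where an artificial feature agrees with the label with a controllable probability during training but not at test time. The NumPy sketch below is purely illustrative and not taken from the paper; the function name `make_spurious_split` and the toy data are hypothetical.

```python
import numpy as np

def make_spurious_split(x, y, correlation=0.9, rng=None):
    """Append a spurious feature that agrees with the binary label
    with probability `correlation` (Colored-MNIST-style toy setup).

    x: (n, d) float features; y: (n,) labels in {0, 1}.
    Returns x augmented with one spurious column.
    """
    rng = np.random.default_rng(rng)
    # With probability (1 - correlation), flip the spurious feature
    # so it disagrees with the label.
    flip = rng.random(len(y)) > correlation
    spurious = np.where(flip, 1 - y, y).astype(np.float32)
    return np.concatenate([x, spurious[:, None]], axis=1)

# Distribution shift between train and test: the spurious column is
# highly predictive at train time but uninformative at test time, so
# a model that relies on it degrades under the shift.
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 8)).astype(np.float32)
y = rng.integers(0, 2, size=1000)
x_train = make_spurious_split(x, y, correlation=0.95, rng=1)  # strong SF
x_test = make_spurious_split(x, y, correlation=0.50, rng=2)   # SF is noise
```

Sweeping `correlation` over a range of values reproduces, under these assumptions, the kind of controlled spurious-correlation experiment the abstract describes for problem (1).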
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: The changes since the last submission address the reviewers' comments as described in our responses:
- improved notation and definitions
- improved the writing of several sections
All modified text is shown in blue.
Assigned Action Editor: ~Pierre_Alquier1
Submission Number: 780