Keywords: Continual Learning, Class-Incremental Learning, Sparsity-based Learning
Abstract: The primary challenge in continual learning is navigating the plasticity-stability dilemma: balancing the acquisition of new knowledge with the retention of old. While leveraging pretrained models has significantly advanced continual learning, existing methods hit a scalability bottleneck on long task sequences, suffering performance degradation due to parameter interference and loss of plasticity. In this work, inspired by evidence that sparse fine-tuning achieves performance comparable to full fine-tuning, we introduce a novel sparsity-driven continual learning framework. Our method, termed CLARE, operates in two stages: it first identifies a sparse, task-critical parameter mask via a sparsity-inducing objective, and then performs mask-constrained fine-tuning. To further reduce interference, we incorporate a gradual forgetting mechanism that resets a small fraction of previously accumulated parameters after learning each new task. To address the lack of benchmark datasets for long-sequence continual learning, we also curate ImageNet-CIL-1K, a challenging long-sequence dataset with 1,069,563 images and 1,000 classes. Extensive experiments demonstrate the scalability of CLARE: on ImageNet-CIL-1K with 100 tasks, it outperforms strong baselines such as APER and MagMax by 4-6% in overall test accuracy and leads EASE by over 10%, establishing a new state of the art for long-sequence continual learning.
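As a rough illustration of the two-stage procedure the abstract describes, below is a minimal PyTorch sketch of masked fine-tuning followed by a small parameter-reset step. The gradient-magnitude top-k scoring used as a stand-in for the sparsity-inducing objective, the 1% mask budget, the random 0.1% reset rule, and all helper names are assumptions for illustration only, not CLARE's actual recipe.

```python
# Minimal sketch of sparse-mask selection, mask-constrained fine-tuning, and a
# gradual-forgetting reset. All specifics below are illustrative assumptions,
# not the submission's actual method.
import torch
import torch.nn as nn


def select_sparse_mask(model: nn.Module, loss_fn, batch, sparsity: float = 0.01):
    """Stage 1 (assumed proxy): score parameters by gradient magnitude on the
    new task and keep only the top `sparsity` fraction of each tensor."""
    x, y = batch
    model.zero_grad()
    loss_fn(model(x), y).backward()
    masks = {}
    for name, p in model.named_parameters():
        if p.grad is None:
            masks[name] = torch.zeros_like(p, dtype=torch.bool)
            continue
        scores = p.grad.abs()
        k = max(1, int(sparsity * scores.numel()))
        threshold = scores.flatten().topk(k).values.min()
        masks[name] = scores >= threshold
    model.zero_grad()
    return masks


def masked_finetune_step(model, optimizer, loss_fn, batch, masks):
    """Stage 2: a standard gradient step, but gradients outside the sparse
    mask are zeroed so only the task-critical parameters move."""
    x, y = batch
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    for name, p in model.named_parameters():
        if p.grad is not None:
            p.grad.mul_(masks[name].to(p.grad.dtype))
    optimizer.step()


def gradual_forgetting(model, pretrained_state, accumulated_mask, reset_frac=0.001):
    """After each task (assumed reset rule): return a tiny random fraction of
    previously updated parameters to their pretrained values."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            updated = accumulated_mask[name]
            reset = updated & (torch.rand_like(p) < reset_frac)
            p[reset] = pretrained_state[name][reset]
            accumulated_mask[name] = updated & ~reset
```

The intended effect, as the abstract suggests, is that constraining updates to a sparse mask leaves most parameters important to earlier tasks untouched, while occasionally resetting a small fraction of accumulated updates frees capacity for future tasks.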
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 5799