Abstract: Class-incremental learning, a sub-field of continual learning, suffers from catastrophic forgetting, a phenomenon where models tend to forget previous tasks while learning new ones. Existing solutions to this problem can be categorized into expansion-based, memory-based, and regularization-based approaches. Most recent advances have focused on the first two categories. In contrast, limited research has been undertaken on regularization-based methods, which offer deployability with computational and memory efficiency. In this paper, we present Self-Supervised Curriculum-based Class Incremental Learning ($S^2C^2IL$), a novel regularization-based algorithm that significantly improves class-incremental learning performance without relying on external memory or network expansion. The key to $S^2C^2IL$ is the use of self-supervised learning to extract rich feature representations from the data available for each task. We introduce a new pretext task that employs stochastic label augmentation instead of traditional image augmentation. To prevent pretext task-specific knowledge from being transferred to the downstream task, we leave out the final section of the pre-trained network during feature transfer. In the downstream task, we use a curriculum strategy that periodically varies the standard deviation of the filter fused with the network. We evaluate the proposed $S^2C^2IL$ using an orthogonal weight modification backbone on four benchmark datasets, split-CIFAR10, split-CIFAR100, split-SVHN, and split-TinyImageNet, and two high-resolution datasets, split-STL10 and ImageNet100. The results show that $S^2C^2IL$ achieves state-of-the-art performance compared to existing regularization-based and memory-based class-incremental learning methods.
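To give a concrete picture of the stochastic label augmentation (SLA) pretext task mentioned in the abstract, the following is a minimal sketch, not the paper's exact implementation: it assumes a shared backbone feeding several auxiliary classification heads, each trained on pseudo-labels drawn at random for every sample; the head count, pseudo-label space, and plain averaged cross-entropy loss are illustrative assumptions rather than the authors' settings.

```python
# Hedged sketch of SLA-style pretext pretraining (assumptions: K auxiliary heads,
# random pseudo-labels per sample, averaged cross-entropy; not the paper's exact setup).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SLAPretrainer(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int,
                 num_heads: int = 4, num_pseudo_classes: int = 10):
        super().__init__()
        self.backbone = backbone                     # shared feature extractor
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, num_pseudo_classes) for _ in range(num_heads)]
        )
        self.num_pseudo_classes = num_pseudo_classes

    def forward(self, x):
        feats = self.backbone(x)                     # (B, feat_dim)
        return [head(feats) for head in self.heads]  # one set of logits per head

def sla_step(model: SLAPretrainer, x: torch.Tensor, optimizer) -> float:
    """One pretext step: sample stochastic labels instead of augmenting images."""
    # Pseudo-labels drawn independently for every head and every sample.
    targets = [torch.randint(0, model.num_pseudo_classes, (x.size(0),), device=x.device)
               for _ in model.heads]
    logits = model(x)
    loss = sum(F.cross_entropy(l, t) for l, t in zip(logits, targets)) / len(logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the method as described, only part of the pre-trained backbone is carried over to the downstream incremental-learning stage; the sketch above covers the multitask pretraining objective only.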
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: We would like to express our gratitude to the reviewers for their valuable feedback and suggestions to improve our paper. In response to their comments, we have made several revisions and would like to highlight the key changes:
1. Updated the Abstract (page 1) and improved the general writing of the paper.
2. Updated the motivation of SSL in the context of Continual Learning, and clarified how stochastic labels are sampled (page 2, section 1 and page 5, section 3.1).
3. Incorporated the explanation of equation 2 (page 7, section 3.2.1, 2nd paragraph).
4. Added the discussion of some existing work (pages 3 and 4, section 2) and updated the figure (page 2).
5. Added the explanation for the intuition behind SLA for better clarity for the readers (pages 12 and 13, section 5).
6. Fixed the citations (pages 3 and 4, section 2, and minor edits in the other part of the paper).
7. Added an experiment to better understand the contribution of multitask learning in the SLA-based pretraining (page 13, section 5).
8. Updated details in the Implementation Details for better reproducibility (page 11, section 4).
9. Added statistical tests for different experiments (page 13, section 5).
10. Added more experiments on high-resolution datasets, split-STL10 and ImageNet-100 (page 16, section 5).
11. Updated Table 3 with results obtained without pretraining (page 11).
12. Added a subsection on the computational time of SLA and $S^2C^2IL$ (page 14, section 5).
Assigned Action Editor: ~Joao_Carreira1
Submission Number: 1020