Continual Learning by Modeling Intra-Class Variation

Longhui Yu; Tianyang Hu; Lanqing HONG; Zhen Liu; Adrian Weller; Weiyang Liu

Published: 09 Mar 2023, Last Modified: 17 Sept 2024. Accepted by TMLR.

Abstract: It has been observed that neural networks perform poorly when the data or tasks are presented sequentially. Unlike humans, neural networks suffer greatly from catastrophic forgetting, making it impossible to perform life-long learning. To address this issue, memory-based continual learning has been actively studied and stands out as one of the best-performing methods. We examine memory-based continual learning and identify that large variation in the representation space is crucial for avoiding catastrophic forgetting. Motivated by this, we propose to diversify representations by using two types of perturbations: model-agnostic variation (i.e., the variation is generated without the knowledge of the learned neural network) and model-based variation (i.e., the variation is conditioned on the learned neural network). We demonstrate that enlarging representational variation serves as a general principle to improve continual learning. Finally, we perform empirical studies which demonstrate that our method, as a simple plug-and-play component, can consistently improve a number of memory-based continual learning methods by a large margin.
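To make the distinction between the two perturbation types concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how either variation is generated, so isotropic Gaussian noise stands in for a model-agnostic variation and a normalized loss-gradient step through a linear softmax classifier stands in for a model-based one. All function names, the `sigma` and `step` parameters, and the linear classifier are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_of(weights, feature):
    """Logits of a linear classifier: one weight row per class (assumed model)."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


def cross_entropy(feature, label, weights):
    """Cross-entropy loss of the assumed linear classifier on one feature."""
    return -math.log(softmax(logits_of(weights, feature))[label])


def model_agnostic_variation(feature, sigma=0.1, rng=random):
    """Perturb a feature without consulting the model.

    Assumption: isotropic Gaussian noise is used as the example perturbation.
    """
    return [x + rng.gauss(0.0, sigma) for x in feature]


def model_based_variation(feature, label, weights, step=0.1):
    """Perturb a feature using the model: step along the loss gradient.

    Assumption: a normalized gradient-ascent step on the cross-entropy loss
    with respect to the feature is used as the example perturbation.
    """
    probs = softmax(logits_of(weights, feature))
    # d(loss)/d(logit_k) = p_k - 1[k == label]
    dlogits = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    # Chain rule through logits = W @ feature.
    grad = [
        sum(dlogits[k] * weights[k][j] for k in range(len(weights)))
        for j in range(len(feature))
    ]
    norm = math.sqrt(sum(g * g for g in grad)) or 1.0  # avoid division by zero
    return [x + step * g / norm for x, g in zip(feature, grad)]
```

In a memory-based setup, perturbed copies like these would be mixed with the replayed representations of old classes to enlarge their intra-class variation. The first function never looks at `weights`, while the second depends on them entirely, which is the contrast the abstract draws.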

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: We sincerely thank the action editor and all the reviewers for the time and effort they spent helping us improve our paper. In our camera-ready revision, we made the following changes:
- We included missing training details in Appendix A (Implementation Details).
- We added an extensive discussion and comparison for the angle-based classifier in Appendix D (Additional Experimental Results and Discussions).
- We added the convergence experiments and the discussion regarding WAP in Appendix D4.
- We open-sourced our PyTorch implementation for full reproducibility. The GitHub link is provided in the paper.

Code: https://github.com/yulonghui/MOCA

Assigned Action Editor: ~Tie-Yan_Liu1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 493
