Online Continual Learning via Logit Adjusted Softmax

Zhehao Huang; Tao Li; Chenhe Yuan; Yingwen Wu; Xiaolin Huang

Published: 29 May 2024. Last Modified: 17 Sept 2024. Accepted by TMLR. CC BY 4.0

Abstract: Online continual learning is a challenging problem in which models must learn from a non-stationary data stream while avoiding catastrophic forgetting. Inter-class imbalance during training has been identified as a major cause of forgetting, biasing model predictions towards recently learned classes. In this paper, we show theoretically that inter-class imbalance is entirely attributable to imbalanced class priors, and that the function learned from intra-class intrinsic distributions is the optimal classifier minimizing the class-balanced error. Accordingly, we show that a simple adjustment of model logits during training can effectively counteract class-prior bias and pursue the corresponding optimum. Our proposed method, Logit Adjusted Softmax, mitigates the impact of inter-class imbalance not only in class-incremental learning but also in realistic scenarios that combine class- and domain-incremental learning, with little additional computational cost. We evaluate our approach on various benchmarks and demonstrate significant performance improvements over prior art. For example, our approach improves on the best baseline by 4.6% on CIFAR10.
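
The logit adjustment described in the abstract can be illustrated with a minimal sketch. It assumes the common logit-adjusted form, in which `tau * log(prior)` is added to each class logit before the softmax cross-entropy is computed during training. The function names, the `tau` parameter, and the count-based prior estimate are illustrative assumptions rather than the authors' implementation; the linked repository is the authoritative reference.

```python
import math


def estimate_priors(label_counts, eps=1e-8):
    """Estimate class priors from label counts seen so far (hypothetical helper).

    eps keeps the log finite for classes that have not been observed yet.
    """
    total = sum(label_counts)
    k = len(label_counts)
    return [(c + eps) / (total + eps * k) for c in label_counts]


def logit_adjusted_cross_entropy(logits, label, class_priors, tau=1.0):
    """Cross-entropy on logits shifted by tau * log(prior) (illustrative sketch).

    Frequent classes receive a larger additive offset during training, so the
    raw logits the model learns are less biased towards those classes.
    """
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, class_priors)]
    # Numerically stable log-sum-exp over the adjusted logits.
    m = max(adjusted)
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[label]


# Usage: a stream dominated by class 0 (9 of 10 recent labels).
priors = estimate_priors([9, 1])
loss_rare = logit_adjusted_cross_entropy([0.0, 0.0], 1, priors)
loss_frequent = logit_adjusted_cross_entropy([0.0, 0.0], 0, priors)
```

With equal raw logits, the rare class incurs the larger loss in this sketch, which pushes the model to raise its raw logit for that class. With uniform priors the offset is the same for every class and the loss reduces to ordinary softmax cross-entropy.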

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission:
* We have acknowledged and discussed the similar motivation of [1] in Section 3.
* We have replaced the "general case" in the main text with the "sum case" and provided more specific descriptions of both.
* We have replaced the "Bayes optimal classifier" with a more specific expression.
* We have incorporated ER-CBA [2] as a baseline in our experiments in Section 6 and provided its implementation details in Appendix E.1.
* We have provided a comprehensive experiment on C-MNIST in Appendix F.1.
* We have provided detailed experimental results for LAS without rehearsal in Appendix F.2, and discussed the drawback of our method in rehearsal-free online CL applications in the limitations part of Section 8.
* We have included explanations of the class-balanced accuracy in the experiment discussion on iNaturalist and in the metrics in Appendix D.
* We have stated our setting after every mention of "SOTA".
* We have added a hyperlink to Algorithm 3 for readers interested in the details of combining distillation with our method.
* We have corrected some typos and references.

Code: https://github.com/K1nght/online_CL_logit_adjusted_softmax

Supplementary Material: zip

Assigned Action Editor: ~ERIC_EATON1

Submission Number: 1851
