Online Continual Learning via Logit Adjusted Softmax

Published: 29 May 2024, Last Modified: 29 May 2024. Accepted by TMLR.
Abstract: Online continual learning is a challenging problem in which models must learn from a non-stationary data stream while avoiding catastrophic forgetting. Inter-class imbalance during training has been identified as a major cause of forgetting, as it biases model predictions towards recently learned classes. In this paper, we show theoretically that inter-class imbalance is entirely attributable to imbalanced class priors, and that the function learned from intra-class intrinsic distributions is the optimal classifier that minimizes the class-balanced error. Building on this analysis, we show that a simple adjustment of model logits during training can effectively resist prior class bias and pursue the corresponding optimum. Our proposed method, Logit Adjusted Softmax, mitigates the impact of inter-class imbalance not only in class-incremental learning but also in realistic scenarios that combine class- and domain-incremental learning, with little additional computational cost. We evaluate our approach on various benchmarks and demonstrate significant performance improvements over prior art. For example, our approach improves on the best baseline by 4.6% on CIFAR10.
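The logit adjustment described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used form of a logit-adjusted loss, in which `tau * log(prior)` is added to each class logit before the softmax cross-entropy during training, and it assumes priors are estimated from label counts over recently seen samples. The function names, the `tau` and `eps` parameters, and the prior estimator are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def estimate_log_priors(labels, num_classes, eps=1e-8):
    """Log of the empirical class priors over the given labels.

    Assumption: priors come from simple label counts; eps avoids log(0)
    for classes that have not been observed yet.
    """
    counts = Counter(labels)
    total = max(len(labels), 1)
    return [math.log(counts.get(c, 0) / total + eps) for c in range(num_classes)]


def logit_adjusted_cross_entropy(logits, label, log_priors, tau=1.0):
    """Cross-entropy computed on logits shifted by tau * log(prior).

    Assumption: the shift is applied only in the training loss; prediction
    would use the raw, unadjusted logits.
    """
    adjusted = [z + tau * lp for z, lp in zip(logits, log_priors)]
    # Numerically stable log-sum-exp over the adjusted logits.
    m = max(adjusted)
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[label]


# Example: class 0 dominates the recent stream, class 1 is rare.
log_priors = estimate_log_priors([0, 0, 0, 1], num_classes=2)
loss_rare = logit_adjusted_cross_entropy([0.0, 0.0], label=1, log_priors=log_priors)
```

In this sketch, equal raw logits yield a larger loss for the rare class than plain cross-entropy would, so the raw logits are pushed to compensate for the skewed prior rather than absorb it.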
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
* We have acknowledged and discussed the similar motivation of [1] in Section 3.
* We have replaced the "general case" in the main text with the "sum case" and provided more specific descriptions of it.
* We have replaced the "Bayes optimal classifier" with a more specific expression.
* We have incorporated ER-CBA [2] as a baseline in our experiments in Section 6 and provided its implementation details in Appendix E.1.
* We have provided a comprehensive experiment on C-MNIST in Appendix F.1.
* We have provided detailed experimental results of LAS without rehearsal in Appendix F.2, and discussed the method's drawback for rehearsal-free online CL applications in the limitations part of Section 8.
* We have included explanations of the class-balance accuracy in the experiment discussion on iNaturalist and in the metrics in Appendix D.
* We now state our setting after every mention of "SOTA".
* We have added a hyperlink to Algorithm 3 for readers interested in the details of combining distillation with our method.
* We have corrected several typos and references.
Supplementary Material: zip
Assigned Action Editor: ~ERIC_EATON1
Submission Number: 1851