LUNCH: Adaptive Balancing of Continual Learning via Hyperparameter Uncertainty

23 Sept 2024 (modified: 13 Dec 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: Life-long Learning; Uncertainty; Hyperparameter Sensitivity
TL;DR: We propose an innovative approach called Learning UNCertain Hyperparameters (LUNCH) for adaptive balancing of task contributions in CL.
Abstract: Continual learning (CL) is characterized by learning sequentially arriving tasks while behaving as if they were observed simultaneously. To prevent catastrophic forgetting of old tasks when learning new tasks, representative CL methods usually employ additional loss terms to balance their contributions (e.g., regularization and replay), modulated by deterministic hyperparameters. However, this strategy struggles to accommodate real-time changes in data distributions and also lacks robustness to subsequent unseen tasks, especially in online scenarios where CL is performed with a one-pass data stream. Inspired by adaptive weighting in multi-task learning, we propose an innovative approach named Learning UNCertain Hyperparameters (LUNCH) for adaptive balancing of task contributions in CL. Specifically, we formulate each CL-relevant hyperparameter as a function of optimizable uncertainty under a homoscedastic assumption and ensure its training stability through an exponential moving average of network parameters. We further devise an evaluation protocol that moderately perturbs the hyperparameter values and reports their impact on performance, so as to analyze the sensitivity of these sub-optimal values in realistic applications. We perform extensive experiments to demonstrate the effectiveness and robustness of our approach, which significantly improves online CL in a plug-in manner (e.g., up to 11.26% and 5.64% on Split CIFAR-100 and Split Mini-ImageNet, respectively) as well as offline CL.
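To make the core idea concrete, the sketch below illustrates homoscedastic-uncertainty loss weighting of the kind the abstract alludes to (in the style of multi-task uncertainty weighting), combined with an exponential moving average for stability. It is a minimal toy example, not the paper's actual implementation: the learnable log-variances `s`, the weighting form `exp(-s_i)*L_i + s_i`, and all numeric values here are illustrative assumptions.

```python
import math

def combined_loss(losses, log_vars):
    # Each loss term L_i is scaled by a learned weight exp(-s_i) and
    # regularized by +s_i, so the balancing hyperparameter is optimized
    # rather than fixed by hand (homoscedastic-uncertainty weighting).
    return sum(math.exp(-s) * L + s for L, s in zip(losses, log_vars))

def grad_log_vars(losses, log_vars):
    # d/ds_i [exp(-s_i) * L_i + s_i] = 1 - exp(-s_i) * L_i
    return [1.0 - math.exp(-s) * L for L, s in zip(losses, log_vars)]

def ema_update(ema, current, decay=0.99):
    # Exponential moving average, a common trick to stabilize training.
    return [decay * e + (1 - decay) * c for e, c in zip(ema, current)]

# Toy usage: two fixed "task losses" (e.g., a new-task loss and a
# replay/regularization loss); learn the balancing weights by descent.
losses = [2.0, 0.5]
s = [0.0, 0.0]          # log-variances; initial weights exp(-s) = 1
ema_s = list(s)
lr = 0.1
for _ in range(200):
    g = grad_log_vars(losses, s)
    s = [si - lr * gi for si, gi in zip(s, g)]
    ema_s = ema_update(ema_s, s)
# At the optimum exp(-s_i) = 1 / L_i: larger losses receive smaller weights,
# which is the adaptive-balancing behavior fixed hyperparameters cannot give.
```

In this toy setting the fixed point is `s_i = ln(L_i)`, so the learned weight of the larger loss shrinks automatically; in the paper's setting the losses themselves change over the task stream, which is exactly when learned weights help over hand-tuned constants.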
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3291