Meta Continual Learning Revisited: Implicitly Enhancing Online Hessian Approximation via Variance Reduction

Yichen Wu; Long-Kai Huang; Renzhen Wang; Deyu Meng; Ying Wei

Meta Continual Learning Revisited: Implicitly Enhancing Online Hessian Approximation via Variance Reduction

Yichen Wu, Long-Kai Huang, Renzhen Wang, Deyu Meng, Ying Wei

Published: 16 Jan 2024, Last Modified: 17 Apr 2024ICLR 2024 oralEveryoneRevisionsBibTeX

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Continual Learning

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: We provide a new perspective to Meta-continual learning and propose a Variance Reduction Meta-CL based on the novel understanding.

Abstract: Regularization-based methods have so far been among the *de facto* choices for continual learning. Recent theoretical studies have revealed that these methods all boil down to relying on the Hessian matrix approximation of model weights. However, these methods suffer from suboptimal trade-offs between knowledge transfer and forgetting due to fixed and unchanging Hessian estimations during training. Another seemingly parallel strand of Meta-Continual Learning (Meta-CL) algorithms enforces alignment between gradients of previous tasks and that of the current task. In this work we revisit Meta-CL and for the first time bridge it with regularization-based methods. Concretely, Meta-CL implicitly approximates Hessian in an online manner, which enjoys the benefits of timely adaptation but meantime suffers from high variance induced by random memory buffer sampling. We are thus highly motivated to combine the best of both worlds, through the proposal of Variance Reduced Meta-CL (VR-MCL) to achieve both timely and accurate Hessian approximation. Through comprehensive experiments across three datasets and various settings, we consistently observe that VR-MCL outperforms other SOTA methods, which further validates the effectiveness of VR-MCL.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Primary Area: transfer learning, meta learning, and lifelong learning

Submission Number: 4430

Loading