Sequential Bayesian Continual Learning with Meta-Learned Neural Networks

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: sequential Bayes, meta-continual learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We introduce a general meta-continual learning framework that combines neural networks' strong representational power and simple statistical models' robustness to forgetting.
Abstract: In the present era of deep learning, continual learning research focuses mainly on mitigating forgetting when training a neural network with stochastic gradient descent (SGD) on a non-stationary stream of data. On the other hand, there is a wealth of research on sequential learning in the more classical literature of statistical machine learning. Many models in this literature have sequential Bayesian update rules that yield the same learning outcome as batch training, i.e., they are completely immune to catastrophic forgetting. However, they suffer from underfitting when modeling complex distributions due to their weak representational power. In this work, we introduce a general meta-continual learning (MCL) framework that combines neural networks' strong representational power and simple statistical models' robustness to forgetting. In our framework, continual learning takes place only in a statistical model in the embedding space via a sequential Bayesian update rule, while meta-learned neural networks bridge the raw data and the embedding space. Since our approach is domain-agnostic and model-agnostic, it can be applied to a wide range of problems and easily integrated with existing model architectures. Compared to SGD-based MCL methods, our approach demonstrates significantly improved performance and scalability.
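To make the core idea concrete, below is a minimal illustrative sketch, not the paper's actual method: it assumes a simple conjugate-Gaussian prototype model as the statistical learner in the embedding space, and a fixed random projection as a placeholder for the meta-learned encoder (the abstract does not specify either choice). The point it demonstrates is that the exact, commutative posterior update makes streaming one example at a time equivalent to batch training, so the embedding-space learner cannot forget.

```python
import numpy as np


class SequentialGaussianClassifier:
    """Per-class mean with a conjugate Gaussian prior (hypothetical example).

    The posterior update is exact and order-independent, so processing a
    non-stationary stream one example at a time gives the same posterior as
    batch training, i.e., no catastrophic forgetting in the embedding space.
    """

    def __init__(self, num_classes, dim, prior_var=1.0, obs_var=0.1):
        self.obs_prec = 1.0 / obs_var
        # Natural parameters of the per-class posterior over the mean.
        self.prec = np.full(num_classes, 1.0 / prior_var)   # posterior precision
        self.prec_mean = np.zeros((num_classes, dim))        # precision * mean

    def update(self, z, y):
        """Sequential Bayesian update with one embedded example (z, y)."""
        self.prec[y] += self.obs_prec
        self.prec_mean[y] += self.obs_prec * z

    def predict(self, z):
        """Return the class whose posterior-mean prototype is closest to z."""
        means = self.prec_mean / self.prec[:, None]
        return int(np.argmin(np.linalg.norm(means - z, axis=1)))


# Stand-in for a meta-learned encoder: here just a fixed random projection.
rng = np.random.default_rng(0)
W = rng.standard_normal((32, 8))
encoder = lambda x: np.tanh(x @ W)

clf = SequentialGaussianClassifier(num_classes=5, dim=8)
for x, y in zip(rng.standard_normal((100, 32)), rng.integers(0, 5, 100)):
    clf.update(encoder(x), int(y))   # continual learning, one example at a time
print(clf.predict(encoder(rng.standard_normal(32))))
```

In the paper's framework the encoder would instead be meta-learned across tasks so that the embedding space is one in which such a simple Bayesian learner suffices; the sketch above only illustrates the forgetting-free sequential update itself.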
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4775