Architecture Matters in Continual Learning

Published: 01 Feb 2023, Last Modified: 13 Feb 2023, Submitted to ICLR 2023
Keywords: Continual Learning, Catastrophic Forgetting, Neural Network Architecture
TL;DR: The choice of architecture can significantly impact the continual learning performance.
Abstract: A large body of research in continual learning is devoted to overcoming the catastrophic forgetting of neural networks by designing new algorithms that are robust to distribution shifts. However, the majority of these works focus strictly on the "algorithmic" part of continual learning for a "fixed neural network architecture", and the implications of using different architectures are not clearly understood. The few existing continual learning methods that expand the model also assume a fixed architecture and develop algorithms that can efficiently use the model throughout the learning experience. In contrast, in this work, we build on existing works that study continual learning from a neural network's architecture perspective and provide new insights into how the architecture choice, for the same learning algorithm, can impact the stability-plasticity trade-off, resulting in markedly different continual learning performance. We empirically analyze the impact of various architectural components, providing best practices and recommendations that can improve continual learning performance irrespective of the learning algorithm.
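To make the experimental setting in the abstract concrete, the sketch below illustrates the general idea of holding the continual learning algorithm fixed (here, naive sequential fine-tuning) while varying only the architecture, and reporting average accuracy and forgetting. This is a hypothetical, minimal example using synthetic tasks; the architecture choices, task construction, and metrics shown here are assumptions for illustration, not the authors' experimental protocol.

```python
# Hypothetical sketch: same learning algorithm (naive fine-tuning), two
# different architectures, compared on average accuracy and forgetting.
# Task data is synthetic; sizes and names are illustrative assumptions.
import torch
import torch.nn as nn

def make_task(seed, n=512, d=32, classes=2):
    # Synthetic classification task: random linear decision rule per task.
    g = torch.Generator().manual_seed(seed)
    x = torch.randn(n, d, generator=g)
    w = torch.randn(d, classes, generator=g)
    y = (x @ w).argmax(dim=1)
    return x, y

def wide_shallow_mlp(d, c):
    return nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, c))

def narrow_deep_mlp(d, c):
    return nn.Sequential(nn.Linear(d, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, c))

def evaluate(model, tasks):
    model.eval()
    accs = []
    with torch.no_grad():
        for x, y in tasks:
            accs.append((model(x).argmax(1) == y).float().mean().item())
    return accs

def run(arch_fn, n_tasks=3, d=32, c=2):
    tasks = [make_task(seed=t) for t in range(n_tasks)]
    model = arch_fn(d, c)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    acc_after = []  # accuracy on task t right after training on task t
    for t, (x, y) in enumerate(tasks):
        model.train()
        for _ in range(100):
            opt.zero_grad()
            nn.functional.cross_entropy(model(x), y).backward()
            opt.step()
        acc_after.append(evaluate(model, [tasks[t]])[0])
    final = evaluate(model, tasks)  # accuracy on all tasks at the end
    forgetting = sum(a - f for a, f in zip(acc_after, final)) / n_tasks
    return sum(final) / n_tasks, forgetting

for name, fn in [("wide-shallow MLP", wide_shallow_mlp),
                 ("narrow-deep MLP", narrow_deep_mlp)]:
    avg_acc, forget = run(fn)
    print(f"{name}: avg accuracy={avg_acc:.3f}, forgetting={forget:.3f}")
```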
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning