Full Elastic Weight Consolidation via the Surrogate Hessian-Vector Product

23 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Elastic Weight Consolidation, Continual Learning, Initialization Regimes
TL;DR: An analysis of EWC using the full FIM: we find that "full EWC" has complementary strengths and weaknesses to standard EWC, which uses the diagonal FIM. Combining the two approaches retains the strengths and mitigates the weaknesses of both.
Abstract: Elastic weight consolidation (EWC) is a widely used method for preventing catastrophic forgetting while learning a series of tasks. The key computation in EWC is the Fisher Information Matrix (FIM), which identifies the parameters that are crucial to previous tasks and should not be altered during new learning. However, practical use of the full FIM (a square matrix whose dimension equals the number of parameters) has been limited by its computational cost. As a result, previous applications of EWC have employed only the diagonal elements, or at most diagonal blocks, of the matrix. In this work, we introduce a memory- and computationally-efficient method for obtaining the EWC gradient step with the full FIM. We evaluate the advantages of the full FIM over the diagonal in EWC on supervised and reinforcement learning tasks; our results demonstrate a quantitative difference between the two approaches, which are more effective when used in combination. Finally, we show both empirically and theoretically that the benefits of the full FIM are greater when the network is initialised in the lazy regime rather than the feature learning regime.
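The paper's specific surrogate is not detailed on this page, but the core primitive the title refers to — a Fisher/Hessian-vector product computed without ever materialising the full matrix — can be sketched. The toy model below is hypothetical (a logistic regression in JAX, not the authors' setup); it relies on the standard fact that for this model the Hessian of the negative log-likelihood coincides with the (generalised Gauss-Newton) FIM, so a forward-over-reverse HVP yields an exact FIM-vector product in O(parameters) memory:

```python
import jax
import jax.numpy as jnp

# Hypothetical toy model: logistic regression with parameter vector w.
def nll(w, x, y):
    # Mean binary cross-entropy with logits; softplus(z) - y*z.
    z = x @ w
    return jnp.mean(jnp.logaddexp(0.0, z) - y * z)

def hvp(f, w, v, *args):
    # Hessian-vector product via forward-over-reverse autodiff:
    # differentiates grad(f) along direction v without building the matrix.
    return jax.jvp(jax.grad(lambda p: f(p, *args)), (w,), (v,))[1]

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 5))
y = (jax.random.uniform(key, (32,)) > 0.5).astype(jnp.float32)
w = jnp.zeros(5)   # at w = 0 every predicted probability is 0.5
v = jnp.ones(5)

# For logistic regression the NLL Hessian equals the FIM, so this is
# a Fisher-vector product computed matrix-free.
fvp = hvp(nll, w, v, x, y)
print(fvp.shape)  # (5,)
```

At `w = 0` the result can be checked against the closed form `(1/n) * X.T @ (0.25 * X) @ v`, since the logistic curvature is 0.25 at probability 0.5; the same `hvp` helper applies unchanged to any differentiable loss.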
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7245