Continual Learning with Orthogonal Weights and Knowledge Transfer

23 Sept 2023 (modified: 01 Mar 2024) · Submitted to ICLR 2024
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Continual Learning, Catastrophic Forgetting, Knowledge Transfer, Orthogonal Gradients Projection, Weight/Parameter-Level Orthogonality
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We propose a novel task-incremental learning (TIL) method with orthogonal weights and knowledge transfer.
Abstract: Orthogonal projection has been shown to be highly effective at overcoming *catastrophic forgetting* (CF) in continual learning (CL). Existing orthogonal projection methods are *all* based on *orthogonal gradients* (OG) between tasks. However, this paper shows theoretically that OG cannot guarantee the elimination of CF, which is a major limitation of existing OG-based CL methods. Our theory further shows that only *weight/parameter-level orthogonality* between tasks can guarantee the elimination of CF, because the final classification is computed from the network weights/parameters alone. Existing OG-based methods also have two other *inherent limitations*: *over-consumption of network capacity* and *limited knowledge transfer* (KT) across tasks. KT is also a core objective of CL. This paper then proposes a novel *weight-level orthogonal projection* method (called STIL), which ensures that each task occupies a weight subspace orthogonal to those of the other tasks. The method also addresses the two other limitations of the OG-based methods. Extensive evaluations show that the proposed STIL not only overcomes CF better than the baselines but, perhaps more importantly, also performs KT much better than they do.
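The sketch below is a minimal numerical illustration (not the paper's actual STIL algorithm) of the two notions of orthogonality contrasted in the abstract: OG-based methods project each new task's *gradient* onto the orthogonal complement of a subspace deemed important for earlier tasks, whereas weight-level orthogonality confines each task's *weights* to mutually orthogonal subspaces. All names and dimensions here are hypothetical.

```python
import numpy as np

# Illustrative sketch only; the paper's STIL method is not specified here.
rng = np.random.default_rng(0)
d = 8  # dimension of a flattened weight vector

# --- Orthogonal-gradient (OG) projection, GPM-style -------------------
# B spans the subspace considered important for previous tasks; the new
# task's gradient is projected onto its orthogonal complement.
B = np.linalg.qr(rng.standard_normal((d, 3)))[0]  # orthonormal basis (d x 3)
g = rng.standard_normal(d)                         # new task's gradient
g_og = g - B @ (B.T @ g)                           # remove components in span(B)
assert np.allclose(B.T @ g_og, 0)                  # only the UPDATE is orthogonal

# --- Weight-level orthogonality ---------------------------------------
# Each task is confined to its own orthogonal weight subspace, so the
# tasks' weight vectors themselves are orthogonal.
U = np.linalg.qr(rng.standard_normal((d, d)))[0]   # orthonormal d x d basis
U1, U2 = U[:, :4], U[:, 4:]                        # disjoint subspaces per task
w1 = U1 @ rng.standard_normal(4)                   # task 1 weights in span(U1)
w2 = U2 @ rng.standard_normal(4)                   # task 2 weights in span(U2)
assert abs(w1 @ w2) < 1e-10                        # the WEIGHTS are orthogonal
```

The sketch makes the abstract's theoretical distinction concrete: after OG projection only the gradient update is orthogonal to the protected subspace, while the weight-level constraint makes the tasks' weight vectors themselves orthogonal, which is the property the paper argues is required to guarantee CF elimination.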
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6996