Generalisation Guarantees For Continual Learning With Orthogonal Gradient Descent

Mehdi Abbana Bennani; Thang Doan; Masashi Sugiyama

Generalisation Guarantees For Continual Learning With Orthogonal Gradient Descent

Mehdi Abbana Bennani, Thang Doan, Masashi Sugiyama

28 Sept 2020 (modified: 22 Jun 2025)ICLR 2021 Conference Blind SubmissionReaders: Everyone

Keywords: Continual Learning, Neural Tangent Kernel, Optimisation

Abstract: In Continual Learning settings, deep neural networks are prone to Catastrophic Forgetting. Orthogonal Gradient Descent (Farajtabar et al., 2019) was proposed to tackle the challenge. However, no theoretical guarantees have been proven yet. We present a theoretical framework to study Continual Learning algorithms in the NTK regime. This framework comprises closed form expression of the model through tasks and proxies for transfer learning, generalisation and tasks similarity. In this framework, we prove that OGD is robust to Catastrophic Forgetting then derive the first generalisation bound for SGD and OGD for Continual Learning. Finally, we study the limits of this framework in practice for OGD and highlight the importance of the NTK variation for Continual Learning.

One-sentence Summary: An NTK framework for Continual Learning, with robustness and generalisation guarantees for Orthogonal Gradient Descent.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 3 code implementations](https://www.catalyzex.com/paper/generalisation-guarantees-for-continual/code)

Reviewed Version (pdf): https://openreview.net/references/pdf?id=9gNv_JhRKC

10 Replies

Loading