Regularizing Trajectories to Mitigate Catastrophic Forgetting

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Blind Submission | Readers: Everyone
TL;DR: Regularizing the optimization trajectory with the Fisher information of old tasks greatly reduces catastrophic forgetting.
Abstract: Regularization-based continual learning approaches generally prevent catastrophic forgetting by augmenting the training loss with an auxiliary objective. However, in most practical optimization scenarios with noisy data and/or gradients, stochastic gradient descent can still inadvertently change parameters that are critical to previous tasks. In this paper, we argue for the importance of regularizing optimization trajectories directly. We derive a new co-natural gradient update rule for continual learning whereby the new-task gradients are preconditioned with the empirical Fisher information of previously learnt tasks. We show that using the co-natural gradient systematically reduces forgetting in continual learning. Moreover, it helps combat overfitting when learning a new task in a low-resource scenario.
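The exact update rule is not given on this page, but the abstract's description (new-task gradients preconditioned with the empirical Fisher information of previously learnt tasks) suggests a sketch like the one below. This is a minimal PyTorch illustration, not the authors' code: the diagonal Fisher approximation, the damping term, and all function and variable names (estimate_diag_fisher, co_natural_step, old_loader) are assumptions made for the example.

```python
# Minimal sketch of a "co-natural" gradient step under a diagonal
# empirical-Fisher assumption; not the paper's reference implementation.
import torch

def estimate_diag_fisher(model, old_loader, loss_fn, device="cpu"):
    """Accumulate squared old-task gradients as a coarse diagonal empirical-Fisher estimate."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters() if p.requires_grad}
    model.eval()
    n_batches = 0
    for x, y in old_loader:
        model.zero_grad()
        loss = loss_fn(model(x.to(device)), y.to(device))
        loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        n_batches += 1
    return {n: f / max(n_batches, 1) for n, f in fisher.items()}

def co_natural_step(model, fisher, lr=1e-3, damping=1e-3):
    """Scale the new-task gradient by the inverse damped diagonal Fisher of the old tasks,
    so directions important to old tasks (large Fisher) receive smaller updates."""
    with torch.no_grad():
        for n, p in model.named_parameters():
            if p.grad is None:
                continue
            precond = 1.0 / (fisher.get(n, torch.zeros_like(p)) + damping)
            p -= lr * precond * p.grad
```

In this reading, estimate_diag_fisher would be run once after training on the old task(s), and co_natural_step would replace the plain SGD step while training the new task.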
Keywords: Continual Learning, Regularization, Adaptation, Natural Gradient
