Implicit biases in multitask and continual learning from a backward error analysis perspective

Published: 07 Nov 2023, Last Modified: 13 Dec 2023, M3L 2023 Poster
Keywords: implicit regularization, continuous approximations of training trajectories, backward error analysis, continual learning, multitask learning, Lie bracket, modified losses, gradient descent
TL;DR: Using backward error analysis, we compute implicit biases in multitask and continual learning, both good (a flatness bias) and bad (gradient conflict).
Abstract: Using backward error analysis, we compute implicit training biases in multitask and continual learning settings for neural networks trained with stochastic gradient descent. In particular, we derive modified losses that are implicitly minimized during training. They have three terms: the original loss, accounting for convergence; an implicit flatness-regularization term proportional to the learning rate; and a last term, the conflict term, which can in theory be detrimental to both convergence and implicit regularization. In the multitask setting, the conflict term is a well-known quantity measuring the gradient alignment between the tasks, while in continual learning the conflict term is a new quantity in deep learning optimization, though a basic tool in differential geometry: the Lie bracket between the task gradients.
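The two conflict terms mentioned in the abstract can be made concrete on toy quadratic losses. The sketch below (an illustration, not the paper's code; the losses and parameter values are made up) computes the multitask conflict term as the inner product of the task gradients, and the continual-learning conflict term as the Lie bracket of the two gradient vector fields, which for gradient fields reduces to H2 ∇L1 − H1 ∇L2 with H_i the Hessian of task i:

```python
import numpy as np

# Toy quadratic task losses L_i(theta) = 0.5 * theta^T A_i theta + b_i^T theta,
# so grad L_i(theta) = A_i theta + b_i and the Hessian of L_i is the constant A_i.
A1 = np.array([[2.0, 0.5], [0.5, 1.0]])
A2 = np.array([[1.0, -0.3], [-0.3, 3.0]])
b1 = np.array([1.0, 0.0])
b2 = np.array([0.0, -1.0])

def grad1(theta):
    return A1 @ theta + b1

def grad2(theta):
    return A2 @ theta + b2

theta = np.array([0.7, -0.2])  # arbitrary evaluation point

# Multitask conflict term: gradient alignment between the two tasks
# (negative alignment = conflicting gradient directions).
alignment = grad1(theta) @ grad2(theta)

# Continual-learning conflict term: the Lie bracket of the gradient fields,
# [grad L1, grad L2](theta) = H2 @ grad L1(theta) - H1 @ grad L2(theta).
# It vanishes identically iff the two gradient flows commute.
lie_bracket = A2 @ grad1(theta) - A1 @ grad2(theta)

print(alignment)
print(lie_bracket)
```

For non-quadratic losses the Hessian-vector products H2 ∇L1 and H1 ∇L2 would be computed with automatic differentiation rather than explicit matrices.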
Submission Number: 2