Stochastic linear dynamics in parameters to deal with neural network plasticity loss

Published: 28 Oct 2023, Last Modified: 02 Apr 2024 · DistShift 2023 Poster
Keywords: online learning, non-stationarity, plasticity loss
TL;DR: A simple modification to the online SGD algorithm that accounts for the non-stationarity in the data
Abstract: Plasticity loss has become an active topic of interest in the continual learning community. Over time, when faced with non-stationary data, standard gradient descent loses its ability to learn. Plasticity loss comes in two forms: the inability of the network to generalize and the inability to fit the training data. Several causes have been proposed, including ill-conditioning and the saturation of activation functions. In this work we focus on the inability of neural networks to optimize due to saturating activations, which particularly affects online reinforcement learning settings, where the learning process itself creates non-stationarity even when the environment is kept fixed. Recent works have proposed to address this problem by dynamically resetting units that appear inactive, allowing them to be tuned further. We explore an alternative approach based on stochastic linear dynamics in the parameters, which models the non-stationarity and provides a mechanism to adaptively and stochastically drift the parameters towards the prior, implementing a soft parameter reset.
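To make the mechanism concrete, below is a minimal sketch (in PyTorch) of how a stochastic linear drift of the parameters towards a prior could be combined with online SGD. This illustrates the general idea only, not the authors' implementation; the function name sgd_with_soft_reset and the drift and noise_std hyperparameters are assumptions chosen for the example.

import torch

def sgd_with_soft_reset(params, priors, lr=1e-2, drift=1e-3, noise_std=1e-3):
    # One online update: a plain SGD step, followed by a stochastic linear
    # drift of each parameter towards its prior (e.g. the initialization).
    # The deterministic drift term softly resets parameters; the noise term
    # keeps saturated units moving so they can be tuned further.
    # (Sketch under assumed hyperparameters, not the paper's exact method.)
    with torch.no_grad():
        for p, p0 in zip(params, priors):
            if p.grad is None:
                continue
            p -= lr * p.grad                      # standard SGD step
            p += drift * (p0 - p)                 # linear pull towards the prior
            p += noise_std * torch.randn_like(p)  # stochastic part of the dynamics

Usage: store the priors once at initialization, then call the update after each backward pass, e.g. priors = [p.detach().clone() for p in model.parameters()]; loss.backward(); sgd_with_soft_reset(list(model.parameters()), priors). Setting drift and noise_std to zero recovers plain online SGD.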
Submission Number: 96