Keywords: Continual Learning, Supervised Learning, Deep Learning, Loss of Plasticity
TL;DR: We compare reinitializing hidden units vs reinitializing weights in neural networks for maintaining plasticity in continual supervised learning; reinitializing weights was more effective across a wider variety of the settings we tested.
Abstract: Loss of plasticity is a phenomenon in which a neural network loses its ability to learn when trained for an extended time on non-stationary data.
It is a crucial problem to overcome when designing systems that learn continually.
An effective technique for preventing loss of plasticity is reinitializing parts of the network.
In this paper, we compare two different reinitialization schemes: reinitializing units vs reinitializing weights.
We propose a new algorithm, selective weight reinitialization, for reinitializing the least useful weights in the network.
We compare our algorithm to continual backpropagation, a previously proposed algorithm that reinitializes units.
Through our experiments on continual supervised learning problems, we identify two settings in which reinitializing weights is more effective at maintaining plasticity than reinitializing units: (1) when the network has a small number of units, and (2) when the network includes layer normalization.
Conversely, reinitializing weights and reinitializing units are equally effective at maintaining plasticity when the network is sufficiently large and does not include layer normalization.
Overall, we find that reinitializing weights maintains plasticity in a wider variety of settings than reinitializing units.
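To make the comparison concrete, below is a minimal sketch of what reinitializing the least useful weights of a single layer could look like. The abstract does not specify how weight utility is measured or how many weights are reinitialized, so the utility measure (weight magnitude scaled by mean input magnitude), the fraction parameter, and the selective_weight_reinit helper are illustrative assumptions, not the paper's actual method.

# Hypothetical sketch: reinitialize the lowest-utility weights of one linear layer.
# Assumption (not from the paper): a weight's utility is |w_ij| * E[|x_j|], i.e.,
# its magnitude scaled by the mean magnitude of the input it multiplies.
import math
import torch

def selective_weight_reinit(layer: torch.nn.Linear,
                            inputs: torch.Tensor,
                            fraction: float = 0.01) -> None:
    """Reinitialize the `fraction` of weights with the lowest assumed utility."""
    with torch.no_grad():
        mean_abs_input = inputs.abs().mean(dim=0)             # shape: (in_features,)
        utility = layer.weight.abs() * mean_abs_input         # shape: (out_features, in_features)

        k = max(1, int(fraction * utility.numel()))
        _, idx = torch.topk(utility.flatten(), k, largest=False)  # k least useful weights

        # Draw fresh values from the layer's default initialization distribution.
        fresh = torch.empty_like(layer.weight)
        torch.nn.init.kaiming_uniform_(fresh, a=math.sqrt(5))
        layer.weight.view(-1)[idx] = fresh.view(-1)[idx]

In a continual-learning loop, such a routine would be called periodically on each layer, analogous to how continual backpropagation periodically reinitializes low-utility units rather than individual weights.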
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3682