On Plasticity, Invariance, and Mutually Frozen Weights in Sequential Task LearningDownload PDF

Published: 09 Nov 2021, Last Modified: 05 May 2023NeurIPS 2021 PosterReaders: Everyone
Keywords: Transfer learning, Deep learning, Sparsity, Invariance, Sequential learning, Critical learning periods, Curriculum learning
TL;DR: Training a deep network with weight decay can both improve invariance and generalization but also remove the ability to adapt to new learning tasks due to a type of sparsity we call mutually frozen weights.
Abstract: Plastic neural networks have the ability to adapt to new tasks. However, in a continual learning setting, the configuration of parameters learned in previous tasks can severely reduce the adaptability to future tasks. In particular, we show that, when using weight decay, weights in successive layers of a deep network may become "mutually frozen". This has a double effect: on the one hand, it makes the network updates more invariant to nuisance factors, providing a useful bias for future tasks. On the other hand, it can prevent the network from learning new tasks that require significantly different features. In this context, we find that the local input sensitivity of a deep model is correlated with its ability to adapt, thus leading to an intriguing trade-off between adaptability and invariance when training a deep model more than once. We then show that a simple intervention that "resets" the mutually frozen connections can improve transfer learning on a variety of visual classification tasks. The efficacy of "resetting" itself depends on the size of the target dataset and the difference of the pre-training and target domains, allowing us to achieve state-of-the-art results on some datasets.
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
Supplementary Material: pdf
Code: zip
11 Replies