Keywords: Fourier, plasticity, neural networks, continual learning
Abstract: Deep neural networks can struggle to learn continually in the face of non-stationarity, a
phenomenon known as loss of plasticity.
In this paper, we identify underlying principles that lead to plastic algorithms.
We provide theoretical results showing that linear function approximation, as well as a special case of deep linear networks, do not suffer from loss of plasticity.
We then propose deep Fourier features, which concatenate a sine and a cosine activation in every layer, and we show that this combination provides a dynamic balance between the trainability obtained through linearity and the expressiveness obtained through the nonlinearity of neural networks.
Deep networks composed entirely of deep Fourier features are highly trainable and sustain their trainability over the course of learning.
Our empirical results show that continual learning performance can be improved by replacing ReLU activations with deep Fourier features combined with regularization.
These results hold for different continual learning scenarios (e.g., label noise, class incremental learning, pixel permutations)
on all major supervised learning datasets used for continual learning research, such as CIFAR10, CIFAR100, and tiny-ImageNet.
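The layer construction described in the abstract, a linear map whose pre-activation is passed through both a sine and a cosine and the results concatenated, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions (shapes, initialization, and function names are illustrative, not the authors' implementation):

```python
import numpy as np

def deep_fourier_layer(x, W, b):
    """One deep-Fourier-feature layer (sketch): apply a linear map,
    then concatenate the sine and cosine of the pre-activation,
    doubling the feature width."""
    z = x @ W + b
    return np.concatenate([np.sin(z), np.cos(z)], axis=-1)

# Illustrative shapes (hypothetical): batch of 4, input width 8,
# pre-activation width 16, so the output width is 32.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 16))
b = np.zeros(16)
h = deep_fourier_layer(x, W, b)
```

One consequence of the construction: since sin(z)^2 + cos(z)^2 = 1 elementwise, each (sine, cosine) pair of output features has unit norm regardless of the pre-activation, which keeps the units responsive rather than saturated.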
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13172