Continual Memory Neurons

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Neuron Model, Online Continual Learning, Replay-buffer-free Learning, Self-organized Memories
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A novel low-level approach to continual learning from a stream of data. A new neuron model that generalizes classic neurons, autonomously learning to address and merge internal memory units, better isolating computations and preserving information.
Abstract: Learning with neural networks by continuously processing a stream of data is closely related to the way humans learn from perceptual information. However, when data are not i.i.d., it is well known that finding a good trade-off between plasticity and stability is hard, frequently resulting in catastrophic forgetting. In this paper, to the best of our knowledge, we are the first to follow a novel route that tackles the problem at the lowest level of abstraction. We propose a neuron model, referred to as Continual Memory Neuron (CMN), which not only computes a response to an input pattern, but also diversifies its computations to preserve what was previously learned, while remaining plastic enough to adapt to new knowledge. The weight values are computed as a function of the neuron input, which acts as a query in a key-value map, selecting and blending a set of learnable memory units. We show that this computational scheme is motivated by and strongly related to those of popular models that perform computations relying on a set of samples stored in a memory buffer, including Kernel Machines and Transformers. Experiments on class- and domain-incremental streams processed in an online, single-pass manner support CMNs' ability to mitigate forgetting, while achieving competitive or better performance than continual learning methods that explicitly store and replay data over time.
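A minimal sketch of how the query-in-a-key-value-map scheme described in the abstract might look in PyTorch. The class name, the number of memory units, the softmax-based addressing, and the temperature parameter are illustrative assumptions for exposition, not the authors' exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContinualMemoryNeuron(nn.Module):
    """Hypothetical sketch: a neuron whose weights are blended from learnable
    memory units, addressed by the input acting as a query."""

    def __init__(self, in_features: int, num_memories: int = 8, temperature: float = 0.1):
        super().__init__()
        # Learnable keys used to address the memory units (one key per unit).
        self.keys = nn.Parameter(torch.randn(num_memories, in_features))
        # Each memory unit stores a candidate weight vector (and bias) for the neuron.
        self.memory_weights = nn.Parameter(torch.randn(num_memories, in_features) * 0.01)
        self.memory_bias = nn.Parameter(torch.zeros(num_memories))
        self.temperature = temperature

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features). The input acts as a query against the keys.
        scores = F.softmax(
            F.normalize(x, dim=-1) @ F.normalize(self.keys, dim=-1).t() / self.temperature,
            dim=-1,
        )  # (batch, num_memories): soft addressing of the memory units
        # Blend the memory units into an input-dependent weight vector and bias.
        w = scores @ self.memory_weights               # (batch, in_features)
        b = scores @ self.memory_bias.unsqueeze(-1)    # (batch, 1)
        # Standard neuron response, but with weights that depend on the query.
        return (w * x).sum(dim=-1, keepdim=True) + b
```

In this reading, only the memory units addressed by a given input are strongly updated by its gradient, which is one plausible way the scheme could isolate computations and limit interference across non-i.i.d. data; the actual mechanism is detailed in the paper.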
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5137