Differentiable Prototypes with Distributed Memory Network for Continual Learning

Published: 01 Jan 2024 · Last Modified: 05 Nov 2025 · HAIS (1) 2024 · CC BY-SA 4.0
Abstract: Deep learning models demonstrate outstanding performance across various fields, but they must continuously learn new knowledge while preserving what they have already learned. The main challenge in continual learning is catastrophic forgetting caused by semantic drift. Existing approaches to this problem often store portions of previously learned data or prototypes, but their effectiveness depends heavily on heuristics, and their storage requirements grow rapidly with increasing data. In this paper, we propose a new method that enables end-to-end learning of prototypes. The method selects prototypes with a distributed memory network, which removes the dependency on hand-crafted selection algorithms and allows automatic updates that keep the prototypes from becoming obsolete. The distributed storage of prototypes spreads information widely across the memory, and this spread is controlled with feature compactness and separateness terms to minimize memory usage. To prevent semantic drift, encoded features are stored in a buffer and a decoder reconstructs the data. Experiments in a continual learning setup on MNIST, permuted MNIST, and Fashion-MNIST show competitive performance against state-of-the-art models. For a buffer size of 500, the performance improvement reaches 0.04%p, 1.45%p, and 5.4%p on MNIST, permuted MNIST, and Fashion-MNIST, respectively.
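The sketch below is one possible reading of the architecture described in the abstract: end-to-end learnable prototypes addressed by a soft, distributed memory read, regularized by compactness and separateness terms, with a decoder that reconstructs inputs from the memory-conditioned features. It is an illustrative assumption rather than the authors' implementation; the module sizes, loss weights, and names such as `PrototypeMemoryNet` and `memory_losses` are hypothetical.

```python
# Minimal sketch (assumed hyperparameters, not the paper's code) of learnable
# prototypes with a distributed soft-attention memory, compactness/separateness
# regularizers, and a reconstruction decoder against semantic drift.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeMemoryNet(nn.Module):
    def __init__(self, in_dim=784, feat_dim=64, n_prototypes=50, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, feat_dim))
        self.decoder = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))
        # End-to-end learnable prototypes (the "distributed memory").
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, feat_dim))
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        z = self.encoder(x)
        # Soft addressing: each feature reads from all prototypes,
        # so information is spread across the whole memory.
        attn = F.softmax(z @ self.prototypes.t() / z.size(1) ** 0.5, dim=1)
        z_mem = attn @ self.prototypes           # memory-conditioned feature
        logits = self.classifier(z_mem)
        x_rec = self.decoder(z_mem)              # reconstruction counters drift
        return logits, x_rec, z, attn

def memory_losses(z, attn, prototypes):
    """Compactness: pull features toward their attended prototype mixture.
    Separateness: push distinct prototypes apart to limit redundant memory use."""
    compact = F.mse_loss(z, attn @ prototypes)
    sim = F.cosine_similarity(prototypes.unsqueeze(0),
                              prototypes.unsqueeze(1), dim=-1)
    separate = (sim - torch.eye(len(prototypes), device=sim.device)).clamp(min=0).mean()
    return compact, separate

# Example training step with assumed loss weights.
model = PrototypeMemoryNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.rand(32, 784), torch.randint(0, 10, (32,))
logits, x_rec, z, attn = model(x)
compact, separate = memory_losses(z, attn, model.prototypes)
loss = (F.cross_entropy(logits, y) + F.mse_loss(x_rec, x)
        + 0.1 * compact + 0.1 * separate)
loss.backward()
opt.step()
```

In this reading, buffered encoded features would be replayed through the same decoder and classifier alongside new data; the relative weighting of the reconstruction, compactness, and separateness terms is an assumption.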