Keywords: Continual Learning
Abstract: Continual learning methods with fixed architectures rely on a single network to learn a model that performs well on all tasks.
As a result, they tend to capture only the features common to those tasks while neglecting each task's specific features. On the other hand, dynamic architecture methods can dedicate a separate network to each task, but they are too expensive to train and do not scale in practice, especially in online settings.
To address this problem, we propose a novel online continual learning method named ``Contextual Transformation Networks'' (CTN) to efficiently model the \emph{task-specific features} while incurring negligible complexity overhead compared to other fixed architecture methods.
Moreover, inspired by the Complementary Learning Systems (CLS) theory, we propose a novel dual memory design and a training objective for CTN that simultaneously mitigates catastrophic forgetting and facilitates knowledge transfer.
Our extensive experiments show that CTN is competitive with a large-scale dynamic architecture network and consistently outperforms other fixed architecture methods using the same standard backbone. Our implementation can be found at \url{https://github.com/phquang/Contextual-Transformation-Network}.
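The abstract does not specify CTN's architecture in detail. The following is a minimal sketch, assuming a FiLM-style task-conditioned modulation of a shared backbone's features, only to illustrate how task-specific features can be modeled with negligible parameter overhead; the class names `ContextualTransform` and `TaskConditionedNet` are hypothetical and not taken from the paper or its repository.

```python
# Hypothetical sketch (not the paper's implementation): task-conditioned
# feature modulation over a shared backbone, illustrating how per-task
# features can be added with negligible parameter overhead.
import torch
import torch.nn as nn


class ContextualTransform(nn.Module):
    """Scales and shifts backbone features using a small per-task embedding."""

    def __init__(self, num_tasks: int, feat_dim: int):
        super().__init__()
        # Only 2 * feat_dim extra parameters per task.
        self.task_embed = nn.Embedding(num_tasks, 2 * feat_dim)
        nn.init.zeros_(self.task_embed.weight)  # start as the identity transform

    def forward(self, feats: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        gamma, beta = self.task_embed(task_id).chunk(2, dim=-1)
        return feats * (1.0 + gamma) + beta


class TaskConditionedNet(nn.Module):
    """Shared backbone + task-specific transformation + shared classifier."""

    def __init__(self, backbone: nn.Module, feat_dim: int,
                 num_tasks: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                                    # common features
        self.transform = ContextualTransform(num_tasks, feat_dim)   # task-specific
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)
        return self.classifier(self.transform(feats, task_id))


if __name__ == "__main__":
    # Toy backbone producing 128-d features; 10 tasks over 100 classes.
    backbone = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 128), nn.ReLU())
    model = TaskConditionedNet(backbone, feat_dim=128, num_tasks=10, num_classes=100)
    x = torch.randn(4, 3, 32, 32)
    task_id = torch.full((4,), 2, dtype=torch.long)
    print(model(x, task_id).shape)  # torch.Size([4, 100])
```

In this sketch the per-task parameters are a single embedding row, so the overhead grows only linearly in the feature dimension per task, in contrast to dynamic architecture methods that add a whole network per task.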
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: This paper develops a novel method that can model task-specific features with minimal complexity overhead.
Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [CIFAR-100](https://paperswithcode.com/dataset/cifar-100), [CORe50](https://paperswithcode.com/dataset/core50), [MNIST](https://paperswithcode.com/dataset/mnist), [mini-Imagenet](https://paperswithcode.com/dataset/mini-imagenet)