Efficient Lifelong Learning with A-GEMDownload PDF

Published: 21 Dec 2018, Last Modified: 21 Apr 2024ICLR 2019 Conference Blind SubmissionReaders: Everyone
Abstract: In lifelong learning, the learner is presented with a sequence of tasks, incrementally building a data-driven prior which may be leveraged to speed up learning of a new task. In this work, we investigate the efficiency of current lifelong approaches, in terms of sample complexity, computational and memory cost. Towards this end, we first introduce a new and a more realistic evaluation protocol, whereby learners observe each example only once and hyper-parameter selection is done on a small and disjoint set of tasks, which is not used for the actual learning experience and evaluation. Second, we introduce a new metric measuring how quickly a learner acquires a new skill. Third, we propose an improved version of GEM (Lopez-Paz & Ranzato, 2017), dubbed Averaged GEM (A-GEM), which enjoys the same or even better performance as GEM, while being almost as computationally and memory efficient as EWC (Kirkpatrick et al., 2016) and other regularization-based methods. Finally, we show that all algorithms including A-GEM can learn even more quickly if they are provided with task descriptors specifying the classification tasks under consideration. Our experiments on several standard lifelong learning benchmarks demonstrate that A-GEM has the best trade-off between accuracy and efficiency
Keywords: Lifelong Learning, Continual Learning, Catastrophic Forgetting, Few-shot Transfer
TL;DR: An efficient lifelong learning algorithm that provides a better trade-off between accuracy and time/ memory complexity compared to other algorithms.
Code: [![github](/images/github_icon.svg) facebookresearch/agem](https://github.com/facebookresearch/agem) + [![Papers with Code](/images/pwc_icon.svg) 1 community implementation](https://paperswithcode.com/paper/?openreview=Hkf2_sC5FX)
Data: [ASC (TIL, 19 tasks)](https://paperswithcode.com/dataset/asc-til-19-tasks), [AwA](https://paperswithcode.com/dataset/awa-1), [CIFAR-100](https://paperswithcode.com/dataset/cifar-100), [CUB-200-2011](https://paperswithcode.com/dataset/cub-200-2011)
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/arxiv:1812.00420/code)
11 Replies