ImpressLearn: Continual Learning via Combined Task Impressions

TMLR Paper545 Authors

26 Oct 2022 (modified: 17 Sept 2024) · Rejected by TMLR · CC BY 4.0
Abstract: This work proposes a new method to sequentially train deep neural networks on multiple tasks without suffering catastrophic forgetting, while endowing them with the capability to quickly adapt to unseen tasks. Starting from existing work on network masking (Wortsman et al., 2020), we show that simply learning a linear combination of a small number of task-specific supermasks (impressions) on a randomly initialized backbone network is sufficient both to retain accuracy on previously learned tasks and to achieve high accuracy on unseen tasks. In contrast to previous methods, we do not need to generate dedicated masks or contexts for each new task, instead leveraging transfer learning to keep per-task parameter overhead small. Our work illustrates the power of linearly combining individual impressions, each of which fares poorly in isolation, to achieve performance comparable to a dedicated mask. Moreover, even repeated impressions from the same task (homogeneous masks), when combined, can approach the performance of heterogeneous combinations if sufficiently many impressions are used. Our approach scales more efficiently than existing methods, often requiring orders of magnitude fewer parameters, and can function without modification even when task identity is missing. In this setting, where task labels are not given at inference, our algorithm offers an often favorable alternative to the one-shot procedure used by Wortsman et al. (2020). We evaluate our method on a number of well-known image classification datasets and network architectures.
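To make the core idea concrete, below is a minimal PyTorch sketch of a layer that applies a learned linear combination of fixed binary supermasks to a frozen, randomly initialized weight matrix, with only the combination coefficients trainable. The class name ImpressionLayer, the random placeholder masks, and the einsum-based combination are illustrative assumptions, not the authors' implementation; in the actual method the stored masks would be supermasks obtained per task, e.g. via the edge-popup algorithm of Wortsman et al. (2020).

import torch
import torch.nn as nn

class ImpressionLayer(nn.Module):
    """Frozen random linear layer gated by a learned linear combination
    of fixed binary supermasks ("impressions"). Hypothetical sketch."""

    def __init__(self, in_features: int, out_features: int, num_impressions: int):
        super().__init__()
        # Random backbone weights, frozen after initialization.
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features), requires_grad=False
        )
        # Fixed binary masks; random placeholders here, edge-popup masks in practice.
        self.register_buffer(
            "masks",
            (torch.rand(num_impressions, out_features, in_features) > 0.5).float(),
        )
        # Per-task combination coefficients: the only trainable parameters.
        self.alphas = nn.Parameter(
            torch.full((num_impressions,), 1.0 / num_impressions)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Effective mask: weighted sum of the stored impressions.
        combined = torch.einsum("k,koi->oi", self.alphas, self.masks)
        return nn.functional.linear(x, self.weight * combined)

Under this sketch, adapting to a new task means optimizing only the `alphas` vector (num_impressions scalars per layer), which is why the per-task parameter overhead stays small relative to learning a dedicated mask over every backbone weight.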
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: The revised version features substantial changes to presentation, writing, and organization. We addressed the requested changes and, among many others, made the following:
I. Regenerated the figures in better style and quality and with explicit legends, split figures into parts where appropriate, and wrote concise, clear captions. We shortened Table 1 to reduce unnecessary clutter and added statistics on the number of tasks and classes per task, as requested.
II. The Approach, Experimental Results, and Discussion (Sections 3, 4, and 5) underwent significant editing that improved presentation, clarified previously confusing statements, and provided a better overview of our approach and its comparison to baselines. A short paragraph about the edge-popup algorithm was added to Section 3. Additional directions for future research are identified in Section 5. Mathematical notation and formulas in Section 3 were simplified and clarified. A new study was added to Section 2, and some clarifications and minor changes were made in Section 1.
III. Added an analysis of the effect of task order on the performance of ImpressLearn in Appendix 4.
Many more miscellaneous edits improve readability and presentation.
Assigned Action Editor: ~Charles_Xu1
Submission Number: 545