Abstract: In lifelong learning, data are used to improve performance not only on the current task, but also on previously encountered and as-yet-unencountered tasks. While typical transfer learning algorithms can improve performance on future tasks, their performance on prior tasks degrades upon learning new tasks (known as forgetting). Many recent approaches for continual or lifelong learning have attempted to maintain performance on old tasks while learning new ones. But striving merely to avoid forgetting sets the goal unnecessarily low. The goal of lifelong learning should be to use data to improve performance on both future tasks (forward transfer) and past tasks (backward transfer). Our key insight is that we can ensemble representations that were learned independently on disparate tasks to enable both forward and backward transfer, with algorithms that run in quasilinear time. Our algorithms demonstrate both forward and backward transfer in a variety of simulated and benchmark data scenarios, including tabular, vision (CIFAR-100, 5-dataset, Split Mini-Imagenet, and Food1k), and audition (spoken digit) data, as well as adversarial tasks, in contrast to various reference algorithms, which typically fail to transfer either forward or backward, or both.
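To make the key insight concrete, here is a minimal, hypothetical sketch of representation ensembling, not the authors' actual algorithm: each task trains its own frozen representation (here, a random forest), and each task's voter is refit on the union of all representations seen so far, so later tasks can improve earlier ones (backward transfer) without modifying previously learned representations. The class name `RepresentationEnsemble`, the use of random forests as per-task representations, and the logistic-regression voters are illustrative assumptions.

```python
# A minimal sketch of representation ensembling for lifelong learning.
# Assumptions (not from the paper): random forests as per-task
# representations, logistic-regression voters, posteriors as features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression


class RepresentationEnsemble:
    def __init__(self):
        self.transformers = []   # one frozen representation per task
        self.task_data = {}      # (X, y) kept per task to refit voters
        self.voters = {}         # one voter (classifier head) per task

    def add_task(self, task_id, X, y):
        # Learn a new representation on this task's data only; it is
        # never modified afterward, so old representations are preserved.
        forest = RandomForestClassifier(n_estimators=50).fit(X, y)
        self.transformers.append(forest)
        self.task_data[task_id] = (X, y)
        # Refit every task's voter on the enlarged set of representations;
        # this is where both forward and backward transfer come from.
        for tid, (Xt, yt) in self.task_data.items():
            self.voters[tid] = LogisticRegression(max_iter=1000).fit(
                self._represent(Xt), yt)

    def _represent(self, X):
        # Concatenate each task's forest posteriors as the ensembled
        # representation (a simple stand-in for learned encoders).
        return np.hstack([f.predict_proba(X) for f in self.transformers])

    def predict(self, task_id, X):
        return self.voters[task_id].predict(self._represent(X))
```

Because each new task adds one fixed-size representation and only the lightweight voters are refit, the per-task cost stays roughly constant in this sketch, consistent in spirit with the quasilinear running time claimed above.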
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=4ID37Uv64p
Changes Since Last Submission: We have addressed the reviewers' concerns and pruned the main content to 12.5 pages, with major changes to the figures. We also added a new baseline, CoSCL, as suggested by the reviewers.
Assigned Action Editor: ~Sungwoong_Kim2
Submission Number: 1318