Learning without Isolation: Pathway Protection for Continual Learning

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: Continual learning
Abstract:

Deep networks are prone to catastrophic forgetting during sequential task learning, i.e., they lose knowledge about old tasks upon learning new ones. To address this, continual learning (CL) has emerged, and its existing methods focus mostly on regularizing or protecting the parameters associated with previous tasks. However, parameter protection is often impractical: either the number of parameters reserved for storing old-task knowledge grows linearly with the number of tasks, or it becomes difficult to preserve the parameters that encode the old-task knowledge. In this work, we bring a dual perspective from neuroscience and physics to CL: across the whole network, the pathways matter more than the individual parameters when it comes to the knowledge acquired from old tasks. Following this perspective, we propose a novel CL framework, learning without isolation (LwI), in which model fusion is formulated as graph matching and the pathways occupied by old tasks are protected without being isolated. Thanks to the sparsity of activation channels in a deep network, LwI can adaptively allocate available pathways to a new task, realizing pathway protection and addressing catastrophic forgetting in a parameter-efficient manner. Experiments on popular benchmark datasets demonstrate the superiority of the proposed LwI.
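To make the graph-matching view of model fusion concrete, here is a minimal, hypothetical sketch (not the authors' LwI implementation): the output channels of one layer from an old-task model and a new-task model are treated as nodes of a bipartite graph, matched with the Hungarian algorithm via scipy.optimize.linear_sum_assignment, and then fused. The function name, the similarity measure, and the fusion weight keep_old are illustrative assumptions; LwI itself protects pathways across the whole network rather than fusing a single layer this way.

```python
# Illustrative sketch only: layer-wise fusion via channel matching.
# Assumes weight matrices of shape (out_channels, in_features).
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_and_fuse(w_old: np.ndarray, w_new: np.ndarray, keep_old: float = 0.7) -> np.ndarray:
    """Align the output channels (rows) of w_new to w_old, then combine them.

    keep_old controls how strongly old-task parameters are preserved; a
    pathway-aware method would set this per channel rather than globally.
    """
    # Cost of assigning new channel j to old channel i: negative inner-product similarity.
    cost = -w_old @ w_new.T                       # shape: (out_channels, out_channels)
    _, col_ind = linear_sum_assignment(cost)      # Hungarian matching
    w_new_aligned = w_new[col_ind]                # permute new channels to line up with old ones
    # Simple convex combination stands in for the fusion step.
    return keep_old * w_old + (1.0 - keep_old) * w_new_aligned


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fused = match_and_fuse(rng.normal(size=(8, 16)), rng.normal(size=(8, 16)))
    print(fused.shape)  # (8, 16)
```

In this toy version, matching before averaging avoids destructively mixing unrelated channels; the paper's point is that preserving such matched pathways, rather than freezing parameters outright, is what keeps old-task knowledge intact.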

Lay Summary:

Deep learning models often struggle with "catastrophic forgetting": when learning new tasks, they abruptly lose knowledge of previous ones. Inspired by neuroscience and physics, where knowledge is thought to reside more in pathways (how neurons connect) than in individual neurons, we propose Learning without Isolation (LwI). Instead of isolating or freezing old parameters, LwI treats the model like a dynamic road network: it identifies and protects the "pathways" critical for old tasks while flexibly assigning unused "routes" to new tasks. This sparse, adaptive approach avoids forgetting without wasting resources. Tested on standard benchmarks, LwI outperforms existing methods while using fewer parameters, much as a brain efficiently organizes knowledge. This work bridges neuroscience, physics, and AI, offering a scalable solution for lifelong learning systems.

Primary Area: Deep Learning->Everything Else
Keywords: Continual learning
Submission Number: 4773