TL;DR: We provide theoretical insights into model merging in continual learning and derive a closed-form optimal solution for adaptive merging.
Abstract: Continual Learning (CL) strives to learn incrementally across tasks while mitigating catastrophic forgetting. A key challenge in CL is balancing stability (retaining prior knowledge) and plasticity (learning new tasks). While representative gradient projection methods ensure stability, they often limit plasticity. Model merging techniques offer promising solutions, but prior methods typically rely on empirical assumptions and carefully selected hyperparameters. In this paper, we explore the potential of model merging to enhance the stability-plasticity trade-off, providing theoretical insights that underscore its benefits. Specifically, we reformulate the merging mechanism using Bayesian continual learning principles and derive a closed-form solution for the optimal merging coefficient that adapts to the diverse characteristics of tasks. To validate our approach, we introduce a two-stage framework named BECAME, which synergizes the expertise of gradient projection and adaptive merging. Extensive experiments show that our approach outperforms state-of-the-art CL methods and existing merging strategies. Code is available at https://github.com/limei0818/BECAME.
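The sketch below illustrates the merging step the abstract describes, assuming the merge is a per-parameter convex combination of two checkpoints with a single coefficient; `compute_alpha` is a hypothetical placeholder standing in for the paper's closed-form, task-adaptive coefficient, which is not reproduced here:

```python
import torch

def merge_models(stable_state: dict, plastic_state: dict, alpha: float) -> dict:
    """Per-parameter convex combination: (1 - alpha) * stable + alpha * plastic."""
    return {
        name: (1.0 - alpha) * stable_state[name] + alpha * plastic_state[name]
        for name in stable_state
    }

# Hypothetical usage, following the two-stage setup described in the paper:
#   stable_model  -- trained on the new task with gradient projection (stability)
#   plastic_model -- fine-tuned freely on the new task (plasticity)
# alpha = compute_alpha(...)  # placeholder for the paper's closed-form coefficient
# merged = merge_models(stable_model.state_dict(), plastic_model.state_dict(), alpha)
# model.load_state_dict(merged)
```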
Lay Summary: AI systems often struggle to learn new information without forgetting what they have already learned. This "catastrophic forgetting" makes it hard to balance remembering old tasks (stability) with adapting to new ones (plasticity). Existing solutions often make the AI too rigid to learn new things well, or require complex manual tuning to combine different approaches.
We explored a smarter way to merge two models: one that remembers old tasks and one specialized for the new task. Instead of guesswork, we developed a method based on Bayesian principles to automatically find the ideal way to combine them, adapting to each task's unique characteristics. Our framework, BECAME, first carefully learns the new task while protecting old knowledge, then learns it more freely, and finally adaptively merges these two resulting models.
Our experiments show BECAME helps AI learn new skills more effectively while retaining old knowledge, outperforming current methods. This offers a more robust and principled way to build AI systems that can learn continuously, much like humans do.
Link To Code: https://github.com/limei0818/BECAME
Primary Area: General Machine Learning->Transfer, Multitask and Meta-learning
Keywords: Continual Learning, Incremental Learning, Model Merging
Submission Number: 70