Null-Space Filtering for Data-Free Continual Model Merging: Preserving Stability, Promoting Plasticity
Keywords: Continual Model Merging, Model Merging
TL;DR: NUFILT is a data-free continual model merging framework using null-space filtering and projection-aware adaptation, achieving transparency, fidelity, and 4–7% accuracy gains over baselines with minimal forgetting and lower overhead.
Abstract: Data-free continual model merging (DFCMM) aims to fuse independently fine-tuned models into a single backbone that evolves with incoming tasks without accessing task data.
This paper revisits two fundamental desiderata for DFCMM: stability (avoiding interference with earlier tasks) and plasticity (adapting faithfully to each new task). This raises a challenge that existing approaches fail to address: how to translate these data-level desiderata into parameter-space optimization so that stability and plasticity hold in the absence of task data.
To this end, we propose NUFILT (NUll-space FILTering), a data-free framework that directly links these desiderata to parameter-space optimization. Our key observation is that task vectors approximately align with representation subspaces, providing structural surrogates for enforcing stability and plasticity.
Accordingly, we design a null-space projector that preserves prior responses by filtering overlapping components of new task vectors, ensuring stability.
We further introduce a lightweight LoRA adapter that injects complementary task-specific signals to enable plasticity.
The adapter is trained with a projection-based surrogate loss that preserves consistency with prior knowledge while introducing novel directions.
This joint filtering–adaptation process enables the backbone to absorb new knowledge while retaining existing behaviors, with updates fused back in a layer-wise linear fashion without extra parameters or inference cost.
Theoretically, we establish approximate subspace alignment guarantees that justify null-space filtering. Empirically, NUFILT achieves state-of-the-art performance with minimal forgetting on both vision and NLP benchmarks, improving average accuracy by 4–7% over OPCM and WUDI-Merging, while narrowing the gap to fine-tuning and reducing computation overhead. The code is available at: https://github.com/zihuanqiu/NUFILT
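The core stability mechanism, filtering a new task vector through the null space of previously merged task vectors, can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: the basis construction via SVD, the rank threshold, and the per-layer flattening are assumptions, and the complementary LoRA adaptation described in the abstract is only indicated by a comment.

```python
import numpy as np

def make_null_space_projector(prior_task_vectors, rel_tol=1e-10):
    """Build a projector onto the null space of the subspace spanned
    by prior task vectors for one layer's weight matrix (assumed form)."""
    # Stack prior task vectors as rows: shape (num_tasks, num_params).
    A = np.stack([v.ravel() for v in prior_task_vectors])
    # Right-singular vectors of A span the prior-task subspace.
    _, S, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep directions with non-negligible singular values (numerical rank).
    r = int(np.sum(S > rel_tol * S[0]))
    V = Vt[:r]  # (r, num_params) orthonormal basis of the prior subspace

    def project(new_task_vector):
        # P v = v - V^T (V v): remove components overlapping prior tasks,
        # leaving only directions orthogonal to earlier updates (stability).
        v = new_task_vector.ravel()
        filtered = v - V.T @ (V @ v)
        # In the full method, a lightweight LoRA adapter would then inject
        # complementary task-specific signal before the layer-wise linear fuse.
        return filtered.reshape(new_task_vector.shape)

    return project

# Usage: filter a new task vector against two prior ones.
rng = np.random.default_rng(0)
priors = [rng.standard_normal((4, 4)) for _ in range(2)]
project = make_null_space_projector(priors)
filtered = project(rng.standard_normal((4, 4)))
# The filtered update is (numerically) orthogonal to every prior task vector.
for p in priors:
    assert abs(np.sum(filtered * p)) < 1e-8
```

The filtered vector leaves prior-task responses unchanged to first order, which is what motivates pairing the projector with an adapter that restores plasticity along the removed directions.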
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 1855