Quasi-Orthogonal Model Merging for Continual Learning

18 Sept 2025 (modified: 13 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Continual Learning, Model Merging
Abstract: Continual learning (CL) seeks to enable models to acquire new tasks sequentially without overwriting prior knowledge. Recently, model merging has emerged as a promising paradigm, where task vectors, i.e., parameter updates induced by fine-tuning, are combined across tasks. However, naive sequential merging often suffers from interference when task vectors overlap in conflicting directions. We introduce Quasi-Orthogonal Model Merging (QOMM), a unified framework that mitigates such interference through two complementary strategies. First, QOMM employs Singular Value Decomposition (SVD) to extract the dominant subspace of previously merged task vectors, and projects each new vector onto its approximate orthogonal complement. This Quasi-Orthogonal Projection (QOP) filters out conflicting directions, reducing interference. Second, QOMM integrates Attention-Exclusive Fine-Tuning (AEFT), which restricts updates to Transformer attention layers. This yields task vectors that are naturally more orthogonal, enhancing the effectiveness of QOP. By combining orthogonality-aware merging with attention-exclusive fine-tuning, QOMM achieves a better balance between stability (retaining past knowledge) and plasticity (adapting to new tasks). Experiments on standard CL benchmarks demonstrate that QOMM consistently outperforms prior methods. Our code will be released.
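The Quasi-Orthogonal Projection step described in the abstract can be sketched as follows. This is an illustrative NumPy implementation based only on the abstract's description, not the authors' released code; the function name `quasi_orthogonal_project` and the `rank` hyperparameter controlling the size of the dominant subspace are assumptions.

```python
import numpy as np

def quasi_orthogonal_project(new_vec, prev_vecs, rank):
    """Project a new task vector onto the approximate orthogonal
    complement of the dominant subspace of previous task vectors.

    new_vec   : (d,) flattened task vector for the new task
    prev_vecs : list of (d,) previously merged task vectors
    rank      : number of dominant singular directions to filter out
                (an assumed hyperparameter)
    """
    # Stack previous task vectors as columns of a (d, t) matrix.
    M = np.stack(prev_vecs, axis=1)
    # SVD: the left singular vectors span the range of the
    # previously merged task vectors.
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    # Keep only the top-`rank` dominant directions.
    U_k = U[:, :rank]
    # Remove the component of the new vector that lies in the
    # dominant subspace, leaving the quasi-orthogonal residual.
    return new_vec - U_k @ (U_k.T @ new_vec)
```

The residual is exactly orthogonal to the retained top-`rank` directions but may still overlap with the discarded low-energy directions, which is why the projection is "quasi" rather than fully orthogonal.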
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 11188