From Coefficients to Directions: Rethinking Model Merging with Directional Alignment

15 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: model merging
Abstract: Model merging is an emerging paradigm that aims to integrate multiple independently trained models into a single model without joint retraining. Previous studies have demonstrated the effectiveness of combining parameters from multiple models through strategies such as parameter decomposition, coefficient optimization, and subspace learning. These methods significantly reduce the need for expensive joint training and have shown strong empirical performance across diverse tasks. However, they predominantly treat merging as a problem of parameter-space decomposition or fusion-coefficient optimization, largely overlooking the role of directional information in both the parameter space and the feature space. In practice, naïve merging introduces inconsistencies in dominant parameter directions and disrupts structural coherence across models, which can degrade performance. Moreover, coefficient-based methods implicitly assume compatible feature-space directions across models, yet Neural Collapse indicates that class features follow structured directional patterns that may differ across independently trained models, making coefficient optimization alone insufficient. In this work, we emphasize the importance of \emph{directional alignment} and introduce a unified geometric framework, \emph{Merging with Directional Alignment}~(MDA), which aligns directional structures consistently in both the parameter and feature spaces. Our theoretical analysis demonstrates that directional alignment improves structural coherence and tightens generalization bounds. Extensive experiments across benchmarks, model scales, and task configurations confirm the effectiveness of our approach.
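The cancellation effect the abstract attributes to naïve merging can be seen in a toy setting. The sketch below is a minimal illustration, not the paper's MDA procedure: the task vectors `tau_a`/`tau_b` and the sign-flip alignment rule are hypothetical stand-ins for whatever directional-alignment step MDA actually performs.

```python
import numpy as np

# Hypothetical illustration (not the MDA algorithm from the paper): two
# fine-tuned parameter deltas share a dominant direction up to sign. Naive
# averaging cancels that direction; a toy direction-aware merge that flips
# misaligned vectors (by cosine similarity) before averaging preserves it.

rng = np.random.default_rng(0)
d = 512

u = rng.normal(size=d)
u /= np.linalg.norm(u)                       # shared dominant direction
tau_a = 3.0 * u + 0.1 * rng.normal(size=d)   # delta of model A
tau_b = -3.0 * u + 0.1 * rng.normal(size=d)  # delta of model B, sign-flipped

# Naive merge: the shared direction cancels, leaving mostly noise.
naive = 0.5 * (tau_a + tau_b)

# Toy direction-aware merge: align tau_b to tau_a before averaging.
cos = tau_a @ tau_b / (np.linalg.norm(tau_a) * np.linalg.norm(tau_b))
aligned_b = tau_b if cos >= 0 else -tau_b
aligned = 0.5 * (tau_a + aligned_b)

print(f"||naive merge||   = {np.linalg.norm(naive):.3f}")   # near zero
print(f"||aligned merge|| = {np.linalg.norm(aligned):.3f}") # near 3.0
```

Running this shows the naive merge collapsing toward zero norm while the aligned merge retains the dominant direction, which is the structural-coherence failure the abstract argues coefficient optimization alone cannot repair.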
Primary Area: foundation or frontier models, including LLMs
Submission Number: 5546