Keywords: Model Merging, Multi-task Learning, Task Arithmetic
Abstract: Model merging has emerged as a promising technique for integrating multiple fine-tuned models into a single unified model without additional training. This paradigm is particularly appealing in resource-constrained scenarios where access to data or retraining is limited. Existing techniques—such as Task Arithmetic, Ties-Merging, and AdaMerging—achieve competitive results but typically rely on extensive hyperparameter tuning, which can be prohibitively expensive for large-scale models. In this work, we propose a hyperparameter-robust merging method that reframes the problem as the estimation of a unified task vector capturing the principal directions of each task (i.e., its dominant singular vectors). We formalize this process as the Gram-weighted Mahalanobis Fréchet mean (GMF-Mean), a convex optimization problem that admits a closed-form solution. Our theoretical analysis shows that GMF-Mean inherently adapts to both orthogonal (non-interfering) and conflicting (collinear but opposing) task interactions by automatically modulating the magnitudes of the principal directions. This property eliminates the need for the costly hyperparameter tuning commonly required by Task Arithmetic-based methods. Empirical results on vision, language, and vision-language models show that GMF-Mean achieves performance competitive with state-of-the-art baselines, while remaining training-free, data-free, and hyperparameter-robust. These properties position GMF-Mean as a practical solution for real-world deployment.
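To make the closed form concrete, below is a minimal NumPy sketch of one plausible instantiation of the GMF-Mean described in the abstract. It assumes a per-layer formulation in which each task's Gram weight is built from the top singular directions of its weight delta; the function names (`gram_weight`, `gmf_mean`), the squared-singular-value weighting, the `rank` cutoff, and the small regularizer `eps` are our assumptions for illustration, not details from the paper.

```python
import numpy as np

def gram_weight(delta, rank):
    """Gram matrix spanned by the top-`rank` singular directions of a
    per-layer task vector (weight delta), weighted by squared singular
    values. (Assumed construction; the paper's exact Gram may differ.)"""
    U, S, _ = np.linalg.svd(delta, full_matrices=False)
    Ur, Sr = U[:, :rank], S[:rank]
    return (Ur * Sr**2) @ Ur.T

def gmf_mean(deltas, rank=4, eps=1e-8):
    """Closed-form Frechet mean under per-task Mahalanobis metrics:
        argmin_X  sum_i tr((X - D_i)^T G_i (X - D_i))
      = (sum_i G_i)^{-1} sum_i G_i D_i.
    Setting the gradient to zero gives the linear system solved below."""
    grams = [gram_weight(d, rank) for d in deltas]
    lhs = sum(grams) + eps * np.eye(deltas[0].shape[0])  # regularize inverse
    rhs = sum(G @ d for G, d in zip(grams, deltas))
    return np.linalg.solve(lhs, rhs)

# Toy usage: two partially conflicting task deltas on a 6x3 "layer".
rng = np.random.default_rng(0)
base = rng.standard_normal((6, 3))
merged = gmf_mean([base, -0.5 * base + 0.1 * rng.standard_normal((6, 3))])
print(merged.shape)  # (6, 3)
```

Qualitatively, this matches the behavior claimed in the abstract: when the tasks' principal subspaces are orthogonal, each Gram matrix acts on its own subspace and both contributions survive; when they are collinear but opposing, the weighted average shrinks the shared direction's magnitude, with no scaling hyperparameter to tune.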
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 10810