RegMean++: Enhancing Effectiveness and Generalization of Regression Mean for Model Merging

TMLR Paper7393 Authors

07 Feb 2026 (modified: 21 Apr 2026) | Decision pending for TMLR | CC BY 4.0
Abstract: Regression Mean (RegMean), an approach that formulates model merging as a linear regression problem, aims to find the optimal weights for each linear layer in the merge model by minimizing the discrepancy in predictions between the merge and candidate models. RegMean provides a precise closed-form solution to the merging problem and therefore offers explainability and computational efficiency. However, RegMean merges each linear layer independently, overlooking how features and information from earlier layers propagate through the network and influence the final prediction of the merge model. Here, we introduce RegMean++, a simple yet effective alternative to RegMean that explicitly incorporates both intra-layer and cross-layer dependencies between the merge model's layers into RegMean's objective. By accounting for these dependencies, RegMean++ better captures the behavior of the merge model. Extensive experiments demonstrate that RegMean++ consistently outperforms RegMean across diverse settings, including in-domain (ID) and out-of-domain (OOD) generalization, sequential merging, large-scale tasks, and robustness under several types of distribution shifts. Furthermore, RegMean++ achieves competitive or state-of-the-art performance compared to various recent advanced model merging methods.
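As a rough illustration of the per-layer regression view described in the abstract, the sketch below merges a single linear layer in closed form with NumPy. It assumes the standard least-squares formulation, in which the merged weight W minimizes the sum over candidates of ||X_i W - X_i W_i||^2 and is therefore given by W = (sum_i X_i^T X_i)^{-1} sum_i X_i^T X_i W_i. The function names `regmean_merge` and `discrepancy`, the `ridge` stabilizer, and the random toy data are illustrative assumptions, not taken from the paper. The sketch covers only the layer-independent RegMean baseline, not the cross-layer procedure of RegMean++.

```python
import numpy as np


def regmean_merge(weights, inputs, ridge=1e-6):
    """Closed-form merge of one linear layer (illustrative sketch).

    weights: list of (d_in, d_out) arrays, one per candidate model.
    inputs:  list of (n_i, d_in) arrays, the inputs each candidate's layer sees.
    ridge:   small diagonal term added for numerical stability (an assumption
             of this sketch, not part of the abstract's description).
    """
    d_in = weights[0].shape[0]
    gram_sum = np.zeros((d_in, d_in))
    rhs = np.zeros_like(weights[0], dtype=float)
    for W_i, X_i in zip(weights, inputs):
        G_i = X_i.T @ X_i          # Gram matrix of this candidate's layer inputs
        gram_sum += G_i
        rhs += G_i @ W_i
    gram_sum += ridge * np.eye(d_in)
    # Solve (sum_i G_i) W = sum_i G_i W_i rather than forming an explicit inverse.
    return np.linalg.solve(gram_sum, rhs)


def discrepancy(W, weights, inputs):
    """Sum of squared differences between merged and candidate layer outputs."""
    return sum(np.linalg.norm(X @ W - X @ W_i) ** 2
               for W_i, X in zip(weights, inputs))


# Toy data: two hypothetical candidates with differently scaled inputs.
rng = np.random.default_rng(0)
inputs = [rng.normal(size=(50, 4)) * scale for scale in (1.0, 3.0)]
weights = [rng.normal(size=(4, 3)) for _ in range(2)]

merged = regmean_merge(weights, inputs)
average = sum(weights) / len(weights)

print("regression merge discrepancy:", discrepancy(merged, weights, inputs))
print("simple average discrepancy:  ", discrepancy(average, weights, inputs))
```

In this toy setup the regression-based merge should give a lower output discrepancy than plain weight averaging, because it weights each candidate by the Gram matrix of its inputs. Each layer here is merged using only its own candidate inputs, which is the layer-wise independence that the abstract identifies as RegMean's limitation.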
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: In our revised manuscript, we have made the following key updates: (1) Clarified the merging process of RegMean++ and updated the algorithm's description, pseudocode, and diagram (Reviewer fZ8w). (2) Ran experiments with three additional baseline methods, discussed the scaling constraints on language tasks (Reviewer NQcZ), and ran additional experiments on language tasks with two Gemma 2 models. (3) Toned down the claim that RegMean++ achieves state-of-the-art performance, because the results on language tasks are mixed. (4) Discussed the computational complexity of RegMean and RegMean++ (Reviewer k673). (5) Revised the introduction and motivation sections and fixed grammatical issues (Reviewer k673 and Reviewer fZ8w). (6) Clarified the hyperparameter selection protocol for language tasks (Reviewer NQcZ).
Assigned Action Editor: ~Mohammad_Emtiyaz_Khan1
Submission Number: 7393