RegMean++: Enhancing Effectiveness and Generalization of Regression Mean for Model Merging

The-Hai Nguyen; Dang Huu-Tien; Takeshi Suzuki; Le-Minh Nguyen

Published: 26 Apr 2026, Last Modified: 26 Apr 2026. Accepted by TMLR. License: CC BY 4.0

Abstract: Regression Mean (RegMean), an approach that formulates model merging as a linear regression problem, aims to find the optimal weights for each linear layer in the merged model by minimizing the discrepancy in predictions between the merged and candidate models. RegMean provides a precise closed-form solution for the merging problem; therefore, it offers explainability and computational efficiency. However, RegMean merges each linear layer independently, overlooking how the features and information in earlier layers propagate through deeper layers and influence the final predictions of the merged model. Here, we introduce RegMean++, a simple yet effective alternative to RegMean, that explicitly incorporates both intra-layer and cross-layer dependencies between merged models' layers into RegMean's objective. By accounting for these dependencies, RegMean++ better captures the behaviors of the merged model. Extensive experiments demonstrate that RegMean++ consistently outperforms RegMean across diverse settings, including in-domain (ID) and out-of-domain (OOD) generalization, sequential merging, large-scale tasks, and robustness under several types of distribution shifts. Furthermore, RegMean++ achieves competitive performance across diverse settings compared to various advanced model merging methods.
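For readers unfamiliar with the regression formulation the abstract refers to, the following is a minimal sketch of a RegMean-style closed-form merge for a single linear layer. It assumes each candidate model i contributes a weight matrix W_i and a matrix of layer inputs X_i, and solves min_W sum_i ||X_i W - X_i W_i||^2. The function name `regmean_merge`, the argument layout, and the small ridge term `eps` are illustrative assumptions, not taken from the authors' repository. The sketch covers only the per-layer step; the cross-layer dependencies that RegMean++ adds are not shown.

```python
import numpy as np


def regmean_merge(weights, inputs, eps=1e-8):
    """Merge one linear layer across models (hypothetical helper).

    weights: list of arrays W_i with shape (d_in, d_out)
    inputs:  list of arrays X_i with shape (n_i, d_in), the layer inputs
             collected from model i on its own data
    eps:     small ridge term (an assumption) that keeps the summed
             Gram matrix invertible

    Minimizing sum_i ||X_i W - X_i W_i||_F^2 gives the closed form
        W = (sum_i G_i)^{-1} (sum_i G_i W_i),  with G_i = X_i^T X_i.
    """
    d_in = weights[0].shape[0]
    gram_sum = eps * np.eye(d_in)
    rhs = np.zeros(weights[0].shape, dtype=float)
    for W, X in zip(weights, inputs):
        G = X.T @ X          # Gram matrix of the layer inputs
        gram_sum += G
        rhs += G @ W
    # Solve the linear system instead of forming an explicit inverse.
    return np.linalg.solve(gram_sum, rhs)
```

Under these assumptions, merging two copies of the same layer returns that layer (up to the ridge term), and for differing layers the result attains a regression objective no larger than that of plain weight averaging.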

Submission Type: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: In our revised manuscript, we have made the following key updates: (1) Clarified the merging process of RegMean++ and updated the algorithm's description, pseudocode, and diagram (Reviewer fZ8w). (2) Conducted experiments with three more baseline methods and discussed the scaling constraints on language tasks (Reviewer NQcZ); also conducted additional experiments on language tasks with two Gemma 2 models. (3) Toned down the claim that RegMean++ achieves state-of-the-art performance, because the results on language tasks are mixed. (4) Discussed the computational complexity of RegMean and RegMean++ (Reviewer k673). (5) Revised the introduction and motivation sections and fixed grammar issues (Reviewer k673 and Reviewer fZ8w). (6) Clarified the hyperparameter selection protocol for language tasks (Reviewer NQcZ).

Code: https://github.com/nthehai01/RegMean-plusplus

Supplementary Material: zip

Assigned Action Editor: ~Mohammad_Emtiyaz_Khan1

Submission Number: 7393
