Keywords: model merging, large language models
Abstract: Fine-tuning large language models provides strong in-domain performance but limits generalization and requires storing many specialized models. Retraining a unified multitask model is often infeasible due to data unavailability or high computational cost. Most model merging approaches instead perform arithmetic operations directly on model parameters.
Although research in model merging has expanded significantly in recent years, two distinct approaches have become dominant: 1) techniques that mitigate interference from redundant parameters and sign conflicts, and 2) techniques that account for the varying sensitivity of individual parameters. However, these two lines of work have developed independently and do not exploit each other's strengths. In this work, we unify these two well-established yet disconnected approaches by integrating insights from both.
We propose DRIFT-MEDIAN, a Fisher-aware model merging method that combines task vectors through a closed-form Fisher-weighted median, weighting each parameter by its sensitivity so that task-relevant parameters dominate the merged model.
Comprehensive experiments on several LLMs and CLIP models demonstrate that DRIFT-MEDIAN outperforms existing model merging methods.
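To illustrate the core idea, here is a minimal sketch of a coordinate-wise Fisher-weighted median merge. This is an illustrative reconstruction, not the authors' implementation: the function names (`weighted_median`, `fisher_median_merge`) and the flattened-parameter representation are assumptions, and the Fisher importances are taken as given per-task, per-parameter weights.

```python
# Hypothetical sketch (not the paper's code): merge fine-tuned models by
# taking, for each parameter coordinate, the weighted median of the task
# vectors, with per-task Fisher importances as the weights.
import numpy as np

def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w)
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return v[idx]

def fisher_median_merge(base, task_params, fishers):
    """Merge fine-tuned parameter vectors into `base` via a
    coordinate-wise Fisher-weighted median of their task vectors."""
    task_vectors = np.stack([p - base for p in task_params])  # shape (T, D)
    fisher = np.stack(fishers)                                # shape (T, D)
    merged_delta = np.array([
        weighted_median(task_vectors[:, j], fisher[:, j])
        for j in range(base.shape[0])
    ])
    return base + merged_delta
```

Under this sketch, a coordinate with a large Fisher weight for one task pulls the merged value toward that task's update, which matches the paper's goal of letting task-relevant parameters dominate.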
Paper Type: Long
Research Area: Language Models
Research Area Keywords: model editing, robustness
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings - efficiency
Languages Studied: English
Submission Number: 5546