Keywords: Model Merging, Multi-task Learning, Efficiency, Personalization
Abstract: Model merging has emerged as a promising approach for enabling multi-task capabilities without additional training.
However, existing methods often suffer substantial performance degradation relative to the individual fine-tuned models, even when the merged tasks are similar, which highlights the importance of preserving task-specific information.
This paper introduces an approximation-based personalized merging method, Decomposition, Thresholding, and Scaling (DTS), which retains task-specific information with minimal storage overhead.
DTS first performs singular value decomposition on the task-specific information and preserves only a small subset of singular values and vectors.
It then applies a novel thresholding strategy to group the elements within each singular vector and computes a scaling factor for each group.
To further support generalization to unseen tasks, this paper extends DTS with a variant that leverages the semantic similarity of task characteristics to merge task-specific information in a data-free manner.
Extensive experiments demonstrate that DTS consistently outperforms state-of-the-art baselines while requiring only 1% additional storage per task.
Furthermore, experiments on unseen tasks show that the DTS variant generalizes significantly better than these baselines.
Our code is available in the supplementary materials.
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 11814