Abstract: Multi-task model merging offers a promising paradigm for integrating multiple expert models into a unified system without additional training. Existing state-of-the-art techniques, such as Task Arithmetic and its variants, merge models by accumulating task vectors—defined as the parameter differences between pre-trained and fine-tuned models. However, task vector accumulation is often hindered by knowledge conflicts, where conflicting components across different task vectors can lead to performance degradation during the merging process. To address this challenge, we propose **Conflict-Aware Task Merging (CAT Merging)**, a novel training-free framework that selectively trims conflict-prone components from the task vectors. CAT Merging introduces several parameter-specific strategies, including projection for linear weights and masking for scaling and shifting parameters in normalization layers. Extensive experiments on vision and vision-language tasks demonstrate that CAT Merging effectively suppresses knowledge conflicts, achieving average accuracy improvements of up to 4.7% (ViT-B/32) and 2.0% (ViT-L/14) over state-of-the-art methods.
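As a rough illustration of the task-vector accumulation that the abstract refers to, the following is a minimal NumPy sketch of plain Task Arithmetic, not of CAT Merging itself. The helper names (`task_vector`, `merge_task_arithmetic`), the `scale` coefficient, and the toy 2x2 weights are assumptions made for this example only. The two toy experts update the same entry in opposite directions, so their contributions cancel after accumulation, a simplified picture of the kind of conflict the abstract describes.

```python
import numpy as np

def task_vector(pretrained, finetuned):
    # Task vector: fine-tuned parameters minus pre-trained parameters.
    return {name: finetuned[name] - pretrained[name] for name in pretrained}

def merge_task_arithmetic(pretrained, finetuned_models, scale=1.0):
    # Start from a copy of the pre-trained weights and add each scaled task vector.
    merged = {name: w.copy() for name, w in pretrained.items()}
    for finetuned in finetuned_models:
        tv = task_vector(pretrained, finetuned)
        for name in merged:
            merged[name] += scale * tv[name]
    return merged

# Toy (hypothetical) models with a single 2x2 linear weight.
pretrained = {"linear.weight": np.zeros((2, 2))}
expert_a = {"linear.weight": np.array([[1.0, 0.0], [0.0, 0.0]])}
expert_b = {"linear.weight": np.array([[-1.0, 0.0], [0.0, 2.0]])}

merged = merge_task_arithmetic(pretrained, [expert_a, expert_b], scale=1.0)
# Entry [0, 0] receives +1 from expert_a and -1 from expert_b, so it cancels to 0.
print(merged["linear.weight"])
```

In this sketch the update that expert A relies on at entry `[0, 0]` is erased by expert B's opposing update. Conflict-aware methods aim to limit this kind of interference by trimming conflict-prone components before accumulation.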
Lay Summary: Modern machine learning systems are often customized for specific tasks, resulting in many separate models that are costly to store and maintain. Merging these task-specific models into a single, unified model would be ideal, but merging often causes interference between tasks, which reduces overall performance.
To address this, we developed CAT Merging, a new technique that combines multiple models without any additional training. Unlike existing methods that simply average model parameters, CAT Merging identifies and removes the components most likely to cause conflicts between tasks.
Our method tailors the merging process to different types of model layers and uses lightweight operations guided by a few example inputs, making it both efficient and easy to apply in practice.
Primary Area: General Machine Learning->Transfer, Multitask and Meta-learning
Keywords: Multi-task Learning, Model Merging, Knowledge Conflict
Submission Number: 9043