Keywords: Model Merging, Router, LLM, Training Free
TL;DR: We propose a training-free, similarity-based router for router-based model merging, matching trained-router performance without any additional training.
Abstract: With the rapid advancement of deep learning, a wide variety of open-source models for different tasks have emerged. However, a single fine-tuned model often fails to meet users' diverse requirements. To address this limitation, model merging has been proposed as an effective approach to integrate the capabilities of existing models into a unified one. Among existing approaches, router-based methods have become representative baselines due to their strong performance; however, their reliance on a trainable router compromises the appealing advantage of traditional model merging: being completely training-free. In this paper, we propose TR-Merging, a training-free router built from a similarity-based perspective. Our method achieves performance on par with trained router-based approaches while eliminating the need for any additional training. We demonstrate the effectiveness of TR-Merging across multiple tasks in both computer vision (CV) and natural language processing (NLP), and show its flexibility in adapting to diverse requirements.
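For intuition only, here is a minimal sketch of what a training-free, similarity-based router for model merging could look like; the prototype features, temperature, and task-vector combination below are illustrative assumptions, not TR-Merging's actual procedure from the paper.

```python
# Hypothetical sketch: route among expert models by cosine similarity between
# an input's feature vector and per-expert "prototype" features, then merge
# the base model with a similarity-weighted combination of expert task vectors.
# All specifics (prototypes, temperature, delta merging) are assumptions.
import numpy as np

def similarity_router(input_feat, expert_prototypes, temperature=0.1):
    """Return soft routing weights over experts via cosine similarity."""
    x = input_feat / np.linalg.norm(input_feat)
    protos = expert_prototypes / np.linalg.norm(expert_prototypes, axis=1, keepdims=True)
    sims = protos @ x                      # cosine similarity per expert
    logits = sims / temperature
    logits -= logits.max()                 # numerical stability for softmax
    weights = np.exp(logits)
    return weights / weights.sum()

def merge_parameters(base_params, expert_deltas, routing_weights):
    """Merge base weights with a routed combination of expert task vectors."""
    merged = {}
    for name, theta0 in base_params.items():
        combo = sum(w * delta[name] for w, delta in zip(routing_weights, expert_deltas))
        merged[name] = theta0 + combo
    return merged

# Toy usage with random placeholders standing in for real models and features.
rng = np.random.default_rng(0)
feat_dim, n_experts = 16, 3
base = {"layer.weight": rng.normal(size=(4, 4))}
deltas = [{"layer.weight": rng.normal(scale=0.01, size=(4, 4))} for _ in range(n_experts)]
prototypes = rng.normal(size=(n_experts, feat_dim))
weights = similarity_router(rng.normal(size=feat_dim), prototypes)
merged = merge_parameters(base, deltas, weights)
print(weights, merged["layer.weight"].shape)
```

Because no component here is trained, the router stays consistent with the training-free premise: routing weights come purely from similarity between the input and each expert's reference statistics.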
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 9563