Unconstrained Model Fusion for Enhanced LLM Reasoning

ACL ARR 2024 December Submission 2162 Authors

16 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · License: CC BY 4.0
Abstract: Domain-specific large language models (LLMs) have achieved success in their respective areas. However, achieving comparably strong performance with a single all-in-one model remains challenging due to the need for proprietary data and high computational costs. In this work, we propose a resource-friendly, unconstrained framework that fuses multiple expert models into a single LLM focused on reasoning tasks. It lifts the constraints on model architecture and size that previous studies often required to be unified. Specifically, homogeneous models are integrated by merging with a fine-grained layer-wise weighting strategy, while heterogeneous models are integrated by fusion built upon probabilistic distribution knowledge derived from instruction-response fine-tuning data. We verify the effectiveness of our method across 7 benchmarks and 9 reasoning-optimized LLMs. Results show that the merged model exhibits composite reasoning capabilities, including logical inference over complex relationships and multi-step problem solving. The proposed unconstrained model-merging framework can serve as a foundation for decentralized LLMs, encouraging broader participation and stimulating further advances in the field of artificial intelligence. Our models will be open-sourced at https://anonymous.4open.science/status/Model-031Fusion-D853.
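For intuition, below is a minimal sketch of fine-grained layer-wise weight merging for homogeneous (same-architecture) models; the per-parameter coefficients and the helper name are illustrative assumptions, not the paper's exact algorithm.

```python
def layerwise_merge(state_dicts, alphas):
    """Merge same-architecture checkpoints with per-parameter coefficients.

    state_dicts: list of model state_dicts (torch tensors) sharing identical
                 keys and shapes.
    alphas: dict mapping parameter name -> list of merge coefficients,
            one per model, summing to 1 for that parameter.
            (Hypothetical weighting scheme for illustration.)
    """
    merged = {}
    for name in state_dicts[0]:
        coeffs = alphas[name]
        # Weighted sum of the same parameter across all expert models.
        merged[name] = sum(c * sd[name].float() for c, sd in zip(coeffs, state_dicts))
    return merged
```

Heterogeneous models, by contrast, cannot be merged in weight space and are instead fused via the output distributions they produce on fine-tuning data, as described in the abstract.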
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: continual learning, applications, fine-tuning
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2162