Unconstrained Model Fusion for Enhanced LLM Reasoning

ACL ARR 2024 December Submission 2162 Authors

16 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · License: CC BY 4.0
Abstract: Domain-specific large language models (LLMs) have achieved success in their respective areas. However, achieving comparably strong performance with a single all-in-one model remains challenging due to the need for proprietary data and high computational costs. In this work, we propose a resource-friendly, unconstrained framework that fuses multiple expert models into a single LLM focused on reasoning tasks. It lifts the constraints on model architecture and size that previous studies often required to be unified. Specifically, homogeneous models are integrated by merging with a fine-grained layer-wise weighting strategy, while heterogeneous models are integrated by fusion built upon probabilistic distribution knowledge derived from instruction-response fine-tuning data. We verify the effectiveness of our method across 7 benchmarks and 9 reasoning-optimized LLMs. Results show that the merged model exhibits composite reasoning capabilities, including logical inference over complex relationships and multi-step problem solving. The proposed unconstrained model-merging framework can serve as a foundation for decentralized LLMs, encouraging broader participation and stimulating further advances in the field of artificial intelligence. Our models will be open-sourced at https://anonymous.4open.science/status/Model-031Fusion-D853.
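For intuition, below is a minimal sketch of fine-grained layer-wise weight merging for homogeneous (same-architecture) models; the per-parameter coefficients and the helper name are illustrative assumptions, not the paper's exact algorithm.

```python
def layerwise_merge(state_dicts, alphas):
    """Merge same-architecture checkpoints with per-parameter coefficients.

    state_dicts: list of model state_dicts (torch tensors) sharing identical
                 keys and shapes.
    alphas: dict mapping parameter name -> list of merge coefficients,
            one per model, summing to 1 for that parameter.
            (Hypothetical weighting scheme for illustration.)
    """
    merged = {}
    for name in state_dicts[0]:
        coeffs = alphas[name]
        # Weighted sum of the same parameter across all expert models.
        merged[name] = sum(c * sd[name].float() for c, sd in zip(coeffs, state_dicts))
    return merged
```

Heterogeneous models, by contrast, cannot be merged in weight space and are instead fused via the output distributions they produce on fine-tuning data, as described in the abstract.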
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: continual learning, applications, fine-tuning
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2162