Simple Llama Merge: What Kind of LLM Do We Need?

Published: 12 Dec 2024, Last Modified: 12 Dec 2024. LMC 2024 Oral. License: CC BY 4.0
Keywords: Large Language Model, Model Merging
Abstract: Model merging integrates multiple specialized models into a single, more powerful model. This approach offers several advantages, including reduced storage and serving costs, improved generalization, and support for decentralized model development. How to effectively combine specialized, fine-tuned small models (8B parameters) so that their merged performance approaches that of larger models remains an open question. This paper describes our method for the simple merging of models from the LLaMA family. The resulting model produces complete, instruction-compliant, and highly accurate answers to questions across multiple domains. It achieved 2nd place on the final test of the LLM Merging Competition. For implementation details, please refer to our GitHub repository at: \url{https://github.com/Catrin-baze/llama-merging}
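The abstract does not specify the exact merging recipe; a common baseline for "simple" merging of same-architecture models (such as members of the LLaMA family) is weighted parameter averaging over matching state-dict entries. The sketch below illustrates that baseline only; the function name `merge_state_dicts` and the use of NumPy arrays in place of real checkpoint tensors are illustrative assumptions, not the authors' code.

```python
import numpy as np

def merge_state_dicts(state_dicts, weights=None):
    """Merge same-shaped model state dicts by weighted parameter averaging.

    state_dicts: list of dicts mapping parameter names to arrays
                 (all dicts must share the same keys and shapes).
    weights: optional per-model mixing coefficients; defaults to a
             uniform average.
    """
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        # Weighted sum of the same parameter across all models.
        merged[key] = sum(w * sd[key] for w, sd in zip(weights, state_dicts))
    return merged

# Toy example standing in for two fine-tuned checkpoints.
model_a = {"layer.weight": np.array([1.0, 2.0])}
model_b = {"layer.weight": np.array([3.0, 4.0])}
merged = merge_state_dicts([model_a, model_b])
# Uniform averaging gives the element-wise mean: [2.0, 3.0]
```

In practice the same loop would run over PyTorch tensors loaded from each checkpoint, and the mixing weights become the main tuning knob.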
Submission Number: 3