LLM Merging Competition Technical Report for NeurIPS 2024: Efficiently Building Large Language Models through Merging

Published: 12 Dec 2024 · Last Modified: 12 Dec 2024 · LMC 2024 Oral · License: CC BY 4.0
Keywords: Model merging, Multi-task, Large language model
TL;DR: We experimented with a range of base models and merging strategies, ultimately choosing Llama3-8B-Instruct and its variants as our base models, merged using the DARE-TIES strategy.
Abstract: We present our solution for the LLM Merging Competition: Building LLMs Efficiently through Merging at NeurIPS 2024. We experimented with a range of base models and merging strategies, ultimately choosing Llama3-8B-Instruct and its variants as our base models, merged using the DARE-TIES strategy. To further improve inference-time performance, we incorporated few-shot enhancement and chain-of-thought prompting techniques. We secured 1st place on the released public dataset with a score of 0.83, and achieved a score of 0.41 in the Finals.
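The DARE-TIES strategy named in the abstract combines DARE's random dropping and rescaling of task-vector (delta) parameters with TIES-style sign election before merging. Below is a minimal, hedged sketch of that recipe for a single parameter tensor; the function name, drop rate, and per-model weights are illustrative assumptions, not the authors' actual configuration or pipeline.

```python
# Illustrative sketch of a DARE-TIES merge over one parameter tensor.
# Hyperparameters (drop_rate, weights) are placeholders, not the report's settings.
import torch

def dare_ties_merge(base, finetuned_list, drop_rate=0.5, weights=None):
    """Merge fine-tuned variants into the base via DARE + TIES sign election.

    base:           parameter tensor from the base model (e.g. Llama3-8B-Instruct)
    finetuned_list: the same parameter from each fine-tuned variant
    drop_rate:      DARE probability of dropping a delta element
    weights:        optional per-model merge weights
    """
    if weights is None:
        weights = [1.0] * len(finetuned_list)

    deltas = []
    for ft, w in zip(finetuned_list, weights):
        delta = ft - base                                   # task vector
        keep = (torch.rand_like(delta) >= drop_rate).float()
        delta = delta * keep / (1.0 - drop_rate)            # DARE: drop and rescale
        deltas.append(w * delta)

    stacked = torch.stack(deltas)                           # [num_models, ...]
    elected_sign = torch.sign(stacked.sum(dim=0))           # TIES: elect majority sign
    agrees = (torch.sign(stacked) == elected_sign).float()  # keep only agreeing deltas
    merged_delta = (stacked * agrees).sum(dim=0) / agrees.sum(dim=0).clamp(min=1.0)

    return base + merged_delta
```

In practice this merge would be applied parameter-by-parameter across the checkpoints; the sketch only conveys the drop-rescale and sign-consensus steps that define DARE-TIES.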
Submission Number: 2