LLM Merging Competition Technical Report for NeurIPS 2024: Efficiently Building Large Language Models through Merging
Keywords: Model merging, Multi-task, Large language model
TL;DR: We experimented with a range of base models and merging strategies, ultimately choosing Llama3-8B-Instruct and its variants as our foundation models, merged using the DARE-TIES strategy.
Abstract: We present our solution for the LLM Merging Competition: Building LLMs Efficiently through Merging, held at NeurIPS 2024. We experimented with a range of base models and merging strategies, ultimately choosing Llama3-8B-Instruct and its variants as our foundation models, merged using the DARE-TIES strategy. To further improve inference-time performance, we incorporated few-shot enhancement and chain-of-thought prompting techniques. We secured 1st place on the released public dataset with a score of 0.83 and achieved a score of 0.41 in the Finals.
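
To illustrate the merging step described above, the following is a minimal Python sketch of DARE-TIES merging over Hugging Face checkpoints: task vectors are computed against the base model, randomly sparsified and rescaled (DARE), and then combined with a sign-consensus average (TIES). The variant repository names and the drop rate are placeholder assumptions, not the exact configuration used in our submission.

```python
# Hedged sketch of DARE-TIES merging; variant names and drop_rate are
# illustrative placeholders, not the submission's actual configuration.
import torch
from transformers import AutoModelForCausalLM

def dare_ties_merge(base_sd, finetuned_sds, drop_rate=0.5, weight=1.0):
    """Merge fine-tuned state dicts into a base state dict with DARE-TIES."""
    merged = {}
    for name, base_param in base_sd.items():
        # Task vectors: difference between each fine-tune and the base model.
        deltas = [sd[name].float() - base_param.float() for sd in finetuned_sds]

        # DARE: randomly drop entries of each task vector, rescale survivors.
        pruned = []
        for d in deltas:
            mask = torch.rand_like(d) >= drop_rate
            pruned.append(d * mask / (1.0 - drop_rate))
        stacked = torch.stack(pruned)

        # TIES: elect a per-entry sign from the summed task vectors, then
        # average only the contributions whose sign agrees with it.
        elected_sign = torch.sign(stacked.sum(dim=0))
        agree = torch.sign(stacked) == elected_sign
        counts = agree.sum(dim=0).clamp(min=1)
        consensus = (stacked * agree).sum(dim=0) / counts

        merged[name] = (base_param.float() + weight * consensus).to(base_param.dtype)
    return merged

if __name__ == "__main__":
    base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
    # Placeholder variant checkpoints; substitute actual fine-tuned Llama3-8B-Instruct variants.
    variants = [AutoModelForCausalLM.from_pretrained(n) for n in ["variant-A", "variant-B"]]
    merged_sd = dare_ties_merge(base.state_dict(), [m.state_dict() for m in variants])
    base.load_state_dict(merged_sd)
```

In practice, the same merge can be expressed declaratively with off-the-shelf merging toolkits that support a dare_ties method; the sketch above only spells out the drop-rescale and sign-consensus logic for clarity.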
Submission Number: 2