Keywords: LLM, Model Merging, DARE, TIES, LoRA
Abstract: With the increasing cost of training large language models, model merging is attracting attention. This report describes our efforts in this area during the NeurIPS 2024 LLM Merging Competition. We developed differentiable DARE-TIES, which optimizes the merging parameters in a differentiable manner. Whereas existing methods rely on black-box optimization algorithms, our method uses gradient descent and is expected to optimize high-dimensional merging parameters more efficiently. We conducted experiments to examine the potential of our approach.
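The abstract only sketches the idea, so the following is a minimal illustrative sketch, not the authors' implementation: it shows how DARE-style drop-and-rescale and TIES-style sign election can be applied while keeping the merging coefficients in the computation graph, so they can be tuned by gradient descent on a validation loss instead of a black-box optimizer. All names (`dare_ties_merge`, the toy vectors, the drop rate `p`, the MSE stand-in loss) are assumptions for illustration, and TIES' magnitude-trimming step is omitted for brevity.

```python
# Hedged sketch of differentiable DARE/TIES-style merging with learnable coefficients.
# This is NOT the competition entry's code; it only illustrates the general idea.
import torch

torch.manual_seed(0)


def dare_ties_merge(base, task_vectors, weights, p=0.5):
    """Merge task vectors into the base parameters.

    DARE step: randomly drop a fraction p of each task vector's entries and
    rescale the survivors by 1/(1-p). TIES-style step: elect a per-entry sign
    from the weighted sum and keep only contributions that agree with it.
    The merge weights stay in the computation graph, so a loss on the merged
    parameters can be backpropagated to them.
    """
    sparsified = []
    for tv in task_vectors:
        mask = (torch.rand_like(tv) > p).float()   # DARE: random drop
        sparsified.append(tv * mask / (1.0 - p))   # rescale survivors
    stacked = torch.stack(sparsified)              # (num_models, dim)
    weighted = weights.view(-1, 1) * stacked
    elected_sign = torch.sign(weighted.sum(dim=0))          # sign election
    agree = (torch.sign(stacked) == elected_sign).float()   # keep agreeing entries
    merged_delta = (weighted * agree).sum(dim=0)
    return base + merged_delta


# Toy setup: a base parameter vector and two "fine-tuned" task vectors.
dim = 64
base = torch.randn(dim)
task_vectors = [torch.randn(dim) * 0.1, torch.randn(dim) * 0.1]
# Stand-in for a validation objective (a real setup would evaluate the merged model).
target = base + 0.7 * task_vectors[0] + 0.3 * task_vectors[1]

# Learnable merging coefficients, optimized with plain gradient descent.
raw_weights = torch.zeros(len(task_vectors), requires_grad=True)
opt = torch.optim.SGD([raw_weights], lr=0.5)

for step in range(200):
    weights = torch.softmax(raw_weights, dim=0)    # positive weights summing to 1
    merged = dare_ties_merge(base, task_vectors, weights)
    loss = torch.nn.functional.mse_loss(merged, target)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("learned merge weights:", torch.softmax(raw_weights, dim=0).detach())
```

In this sketch the gradient reaches the merge coefficients through the weighted task-vector sum; the random DARE mask and the sign election act as fixed multipliers within each step, which is one plausible way to keep the pipeline differentiable.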
Submission Number: 5