Keywords: LLM, multilingual, model merging, multitask, LoRA
TL;DR: This paper leverages model merging techniques to combine independently trained language adapters, improving training efficiency.
Abstract: Fine-tuning a task-specific multilingual large language model (LLM) involves training the model on a multilingual dataset with examples in all the required languages. Updating one or more supported languages with additional data, or adding support for a new language, requires retraining the model, which is computationally inefficient. Recent research on merging multiple task-specific models has shown promise in terms of both computational efficiency and improved performance. However, these approaches only consider multiple tasks in a single language; their effectiveness for merging language-specific models trained on a single task is underexplored. In this work, we evaluate existing model merging approaches in a multilingual setting on three independent tasks. Our experiments show that model merging achieves performance on par with both models trained on a combined dataset of multiple languages and language-specific fine-tuned models. Our analysis indicates that merging improves training efficiency, reducing the time needed to add or update a language by a factor of 2.5 and cutting training costs by a factor of 3.
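The abstract refers to existing model merging approaches without specifying them; as a rough illustration only, the sketch below shows one common merging baseline, uniform weight averaging of independently trained per-language adapters. It assumes PyTorch state dicts, and the checkpoint names in `adapter_paths` are hypothetical; this is not necessarily the paper's exact method.

```python
# A minimal sketch of one model-merging baseline: averaging the weights of
# independently trained language adapters. Assumes each adapter is saved as
# a PyTorch state dict; file names below are hypothetical placeholders.
import torch

def merge_adapters(state_dicts, weights=None):
    """Merge adapter state dicts by (optionally weighted) averaging."""
    if weights is None:
        # Default to a uniform average over all adapters.
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        # Weighted sum of the corresponding tensor across all adapters.
        merged[key] = sum(w * sd[key] for w, sd in zip(weights, state_dicts))
    return merged

# Usage: load per-language LoRA adapter checkpoints and merge them into one.
adapter_paths = ["lora_en.pt", "lora_de.pt", "lora_hi.pt"]  # hypothetical files
language_adapters = [torch.load(p, map_location="cpu") for p in adapter_paths]
merged_adapter = merge_adapters(language_adapters)
torch.save(merged_adapter, "lora_merged.pt")
```

Under this scheme, adding or updating a language only requires training that language's adapter and re-running the cheap merge step, rather than retraining on the full combined multilingual dataset, which is consistent with the efficiency gains the abstract reports.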
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 19889