MLM: Multi-Linguistic LoRA Merging

Published: 16 Oct 2025, Last Modified: 10 Nov 2025, NeurIPS 2025 ER Workshop, CC BY 4.0
Keywords: Multilingual, Parameter-Efficient Fine-tuning, Model Merging
Abstract: Large Language Models (LLMs) often struggle to generalize across languages, exhibiting strong task performance in high-resource settings but substantial degradation in low-resource ones. This performance gap arises not only from task-specific data scarcity but also from the language imbalance embedded in pre-training. In this work, we propose Multi-Linguistic LoRA Merging (MLM), a modular fine-tuning framework that decouples task and language adaptation into two independently trained LoRA-based adapters: a Task Adapter (TA) trained on a high-resource language, and a Language Adapter (LA) trained on the target low-resource language while both the TA and the base model remain frozen. The LA serves as a linguistic bridge, aligning language-specific representations with the task logic encoded in the TA. The two adapters are then combined through simple parameter-space interpolation, followed by a lightweight post-merging alignment stage that refines their interaction. This design makes training highly sample-efficient, requiring only a single-epoch update for each new language adapter while preserving the strong task knowledge of the TA. We evaluate MLM on the MMLU-ProX benchmark across multiple train/test splits and model sizes (LLaMA-3.2 1B and 3B), demonstrating consistent improvements over strong baselines in both Spanish and Hindi transfer. Our results highlight that modular, decoupled adaptation provides an effective and scalable recipe for efficient multilingual fine-tuning.
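To make the merging step concrete, below is a minimal sketch of parameter-space interpolation between two LoRA adapters, assuming PyTorch and representing each adapter as a mapping from module names to its low-rank factors (A, B). The names `task_adapter`, `lang_adapter`, and the mixing weight `alpha` are illustrative assumptions, not the paper's notation or released code; the post-merging alignment stage is not shown.

```python
import torch

def lora_delta(adapter: dict[str, tuple[torch.Tensor, torch.Tensor]]) -> dict[str, torch.Tensor]:
    """Materialize each module's LoRA weight update: delta_W = B @ A."""
    return {name: B @ A for name, (A, B) in adapter.items()}

def merge_adapters(task_adapter, lang_adapter, alpha: float = 0.5):
    """Interpolate the two adapters' weight updates in parameter space.

    Assumes both adapters cover the same set of module names.
    """
    task_d = lora_delta(task_adapter)
    lang_d = lora_delta(lang_adapter)
    return {
        name: alpha * task_d[name] + (1.0 - alpha) * lang_d[name]
        for name in task_d
    }

# Hypothetical usage: the merged delta would be added onto the frozen base
# weights, after which a lightweight alignment pass (not shown) refines the
# combination, per the abstract.
if __name__ == "__main__":
    rank, d_in, d_out = 8, 64, 64
    ta = {"q_proj": (torch.randn(rank, d_in), torch.randn(d_out, rank))}
    la = {"q_proj": (torch.randn(rank, d_in), torch.randn(d_out, rank))}
    merged = merge_adapters(ta, la, alpha=0.5)
    print(merged["q_proj"].shape)  # torch.Size([64, 64])
```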
Submission Number: 289