Abstract: Achieving robust performance across diverse language pairs remains a major challenge for any-to-any machine translation, primarily due to the scarcity of non-English parallel corpora.
Existing approaches often rely on massive multilingual datasets or cascaded translation pipelines, which introduce inefficiencies, added computational cost, and error propagation.
Models trained on limited language pairs struggle to generalize to unseen language directions, hindering practical deployment in real-world multilingual scenarios.
To address these limitations, we propose UniLoRA, a novel instruction fine-tuning framework for Large Language Models (LLMs) that enables efficient any-to-any translation with only limited multilingual parallel data.
Our approach leverages English-centric parallel corpora alongside limited multilingual translation examples to align cross-lingual representations, effectively bridging language gaps without requiring exhaustive language-specific supervision.
UniLoRA employs parameter-efficient Low-Rank Adaptation (LoRA) modules within a Mixture-of-Experts (MoE) framework to enable dynamic adaptation to arbitrary translation directions.
Experiments demonstrate that our approach achieves competitive performance across diverse translation directions.
This work provides a resource-efficient paradigm for democratizing high-quality any-to-any translation capabilities across linguistically diverse environments. Our code is available at https://anonymous.4open.science/r/UniL-1BD1/.
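The abstract describes combining LoRA modules with an MoE router; the paper itself is not shown here, so the following is only a minimal illustrative sketch of one way such a layer could look. All names (LoRAMoELinear, num_experts, rank, alpha) are hypothetical and assumed for illustration, not taken from UniLoRA.

```python
# Illustrative sketch only: a LoRA Mixture-of-Experts layer, assuming the
# abstract's description of "LoRA modules within an MoE framework".
# All identifiers here are hypothetical, not the paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRAMoELinear(nn.Module):
    def __init__(self, base: nn.Linear, num_experts: int = 4, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # keep pretrained weights frozen
            p.requires_grad = False
        d_in, d_out = base.in_features, base.out_features
        # One low-rank (A, B) pair per expert; only these (and the router) train.
        self.lora_A = nn.Parameter(torch.randn(num_experts, rank, d_in) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(num_experts, d_out, rank))
        self.router = nn.Linear(d_in, num_experts)  # token-level gating network
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_in)
        gate = F.softmax(self.router(x), dim=-1)                   # (b, s, E)
        low = torch.einsum("bsd,erd->bser", x, self.lora_A)        # (b, s, E, r)
        delta = torch.einsum("bser,eor->bseo", low, self.lora_B)   # (b, s, E, d_out)
        update = (gate.unsqueeze(-1) * delta).sum(dim=2)           # gate-weighted expert mix
        return self.base(x) + self.scaling * update

# Usage: wrap a frozen projection of a pretrained LLM.
layer = LoRAMoELinear(nn.Linear(4096, 4096), num_experts=4, rank=8)
out = layer(torch.randn(2, 16, 4096))
print(out.shape)  # torch.Size([2, 16, 4096])
```

In this reading, the router lets different low-rank experts specialize, which is one plausible mechanism for adapting dynamically to arbitrary translation directions while keeping the trainable parameter count small.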
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: multilingual MT, efficient MT training, modeling
Contribution Types: Approaches to low-resource settings, Approaches low compute settings-efficiency
Languages Studied: Chinese, English, German, Russian, Czech, Icelandic
Keywords: multilingual machine translation, mixture-of-experts, parameter-efficient training, large language model
Submission Number: 1965