Abstract: Generative Large Language Models (LLMs) and the associated pre-training and fine-tuning paradigms have achieved significant advances in various NLP tasks.
However, Multilingual Neural Machine Translation (MNMT) systems encounter capacity constraints when scaling to numerous languages with fixed model size, resulting in degraded translation quality, particularly for supervised tasks. Furthermore, the scarcity of parallel corpora for non-English language pairs limits expansion to new translation directions.
This paper presents CrossLoRA, a novel MNMT framework that combines Low-Rank Adaptation (LoRA) with a Mixture-of-Experts (MoE) architecture featuring cross-connected language-specific experts. Our approach establishes dedicated experts for individual languages while enabling strategic interaction between source and target language experts during the translation process.
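As a concrete illustration (not the paper's exact formulation), the sketch below shows one plausible way to realize cross-connected language-specific LoRA experts on top of a frozen linear layer in PyTorch. The class name CrossLoRALinear, the per-language (A, B) pairs, and the mix blending coefficient are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CrossLoRALinear(nn.Module):
    """Illustrative sketch: a frozen linear layer augmented with one LoRA
    expert per language; the source- and target-language experts are
    blended ("cross-connected") when translating. The blending scheme and
    the hyperparameters here are assumptions, not the paper's exact design."""

    def __init__(self, base: nn.Linear, languages, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        d_in, d_out = base.in_features, base.out_features
        self.scaling = alpha / rank
        # One low-rank (A, B) pair per language.
        self.lora_A = nn.ParameterDict(
            {lang: nn.Parameter(torch.randn(rank, d_in) * 0.01) for lang in languages})
        self.lora_B = nn.ParameterDict(
            {lang: nn.Parameter(torch.zeros(d_out, rank)) for lang in languages})

    def expert(self, x: torch.Tensor, lang: str) -> torch.Tensor:
        # Standard LoRA update: x -> B A x, scaled by alpha / rank.
        return (x @ self.lora_A[lang].T) @ self.lora_B[lang].T * self.scaling

    def forward(self, x, src_lang: str, tgt_lang: str, mix: float = 0.5):
        # Cross-connection: combine the source- and target-language experts.
        delta = (1.0 - mix) * self.expert(x, src_lang) + mix * self.expert(x, tgt_lang)
        return self.base(x) + delta


# Toy usage: a German-to-Chinese forward pass through one adapted layer.
layer = CrossLoRALinear(nn.Linear(512, 512), ["en", "de", "zh", "ru", "cs", "is"])
hidden = torch.randn(2, 10, 512)
out = layer(hidden, src_lang="de", tgt_lang="zh")
```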
To achieve any-to-any translation capability, we tailor a two-stage fine-tuning paradigm for the CrossLoRA framework with self-contrastive semantic enhancement: the model is first fine-tuned using English as the pivot language, followed by pseudo-corpus generation and further fine-tuning on the generated data.
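The two-stage paradigm could be organized roughly as follows. This is a hedged sketch: fine_tune, translate, and two_stage_finetune are hypothetical stand-ins for the actual training and decoding routines, and the pivot-based pseudo-corpus construction is one plausible reading of the description above.

```python
from typing import List, Tuple

# (src_lang, src_text, tgt_lang, tgt_text)
Example = Tuple[str, str, str, str]

def fine_tune(model, data: List[Example]):
    """Placeholder: supervised fine-tuning (stage-specific losses,
    e.g. a self-contrastive term, would be added here)."""
    return model

def translate(model, text: str, src_lang: str, tgt_lang: str) -> str:
    """Placeholder: decoding with the current model."""
    return text

def two_stage_finetune(model, english_centric: List[Example],
                       non_english_sources: List[Tuple[str, str, str]]):
    # Stage 1: fine-tune on English-centric pairs (X<->En),
    # i.e. English serves as the pivot language.
    model = fine_tune(model, english_centric)

    # Stage 2: build a pseudo parallel corpus for non-English directions by
    # pivoting through English, then fine-tune again on the generated data.
    pseudo: List[Example] = []
    for src_lang, text, tgt_lang in non_english_sources:
        en = translate(model, text, src_lang, "en")
        tgt = translate(model, en, "en", tgt_lang)
        pseudo.append((src_lang, text, tgt_lang, tgt))
    return fine_tune(model, pseudo)
```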
Experimental results on multilingual translation datasets confirm the translation quality improvements and parameter efficiency of the CrossLoRA framework.
Our findings provide an effective recipe for fine-tuning LLMs to achieve any-to-any translation capability.
Our code is available at: https://anonymous.4open.science/r/CrossL-3FBF/.
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: efficient MT training, multilingual MT
Contribution Types: Approaches to low-resource settings
Languages Studied: Chinese, English, German, Russian, Czech, Icelandic
Keywords: multilingual machine translation, mixture-of-experts, parameter-efficient training, large language model
Submission Number: 4433