Are Small Language Models the Silver Bullet for Low-Resource Language Machine Translation?

20 Sept 2025 (modified: 24 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Low-resource language, Small language models, Luxembourgish, Monolingual distillation
Abstract: Small language models (SLMs) are parameter-efficient variants of large language models, designed for computational efficiency while retaining core linguistic competence. This study investigates the persistent challenges of translation performance in low-resource languages (LRLs) through a systematic evaluation of SLMs across 200 languages. In contrast to prior research, which has only marginally addressed LRL-oriented distillation, this work provides empirical evidence that distilling knowledge from large teacher models into compact SLMs (2B/3B parameters) using predominantly monolingual LRL data yields substantial translation improvements, at times surpassing models of up to 70B parameters. The primary contributions are as follows: (1) the first comprehensive quantitative benchmark of SLMs across 200 languages, with explicit emphasis on LRL limitations; (2) the demonstration that knowledge distillation for LRLs improves translation quality without inducing catastrophic forgetting, together with key design priorities: full-model training over LoRA-based strategies, data quality over data volume, and decoder-only teachers over encoder-decoder teachers; and (3) the confirmation that these gains are robust and transferable across a wide spectrum of LRLs, establishing a scalable and cost-effective methodology for reducing fairness disparities in multilingual translation. Overall, this study offers a rigorous validation of the feasibility of, and methodological best practices for, applying SLMs to LRLs, laying an empirical foundation for their reliable deployment in low-resource scenarios.
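To make the distillation recipe described above concrete, the sketch below illustrates one plausible reading of it: a large teacher translates monolingual LRL (e.g., Luxembourgish) sentences to produce synthetic parallel data, and a 2B-class student is then fully fine-tuned on those pairs. This is a minimal, hedged sketch assuming a Hugging Face-style setup; the model names (`meta-llama/Llama-3.1-70B-Instruct`, `google/gemma-2-2b-it`), the file path `monolingual_lb.txt`, the prompt format, and all hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
# Monolingual sequence-level distillation sketch for a low-resource language.
# All model names, paths, and hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# --- Step 1: a large teacher turns monolingual LRL text into synthetic parallel data ---
teacher_name = "meta-llama/Llama-3.1-70B-Instruct"  # assumed large teacher
teacher_tok = AutoTokenizer.from_pretrained(teacher_name)
teacher = AutoModelForCausalLM.from_pretrained(
    teacher_name, torch_dtype=torch.bfloat16, device_map="auto"
)

def teacher_translate(sentence: str, src: str = "Luxembourgish", tgt: str = "English") -> str:
    """Prompt the teacher to translate one monolingual LRL sentence."""
    prompt = f"Translate the following {src} sentence into {tgt}.\n{src}: {sentence}\n{tgt}:"
    inputs = teacher_tok(prompt, return_tensors="pt").to(teacher.device)
    out = teacher.generate(**inputs, max_new_tokens=128, do_sample=False)
    new_tokens = out[0][inputs["input_ids"].shape[1]:]
    return teacher_tok.decode(new_tokens, skip_special_tokens=True).strip()

# Monolingual Luxembourgish corpus, one sentence per line (path is hypothetical).
with open("monolingual_lb.txt", encoding="utf-8") as f:
    lb_sentences = [line.strip() for line in f if line.strip()]

synthetic_pairs = [(s, teacher_translate(s)) for s in lb_sentences]

# --- Step 2: full-parameter fine-tuning of a small student on the synthetic pairs ---
student_name = "google/gemma-2-2b-it"  # assumed 2B-class student
student_tok = AutoTokenizer.from_pretrained(student_name)
student = AutoModelForCausalLM.from_pretrained(
    student_name, torch_dtype=torch.bfloat16
).to(device)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

student.train()
for src, tgt in synthetic_pairs:
    text = f"Translate Luxembourgish to English.\nLuxembourgish: {src}\nEnglish: {tgt}"
    batch = student_tok(text, return_tensors="pt", truncation=True, max_length=512).to(device)
    # Standard causal LM loss over the full sequence (teacher output serves as the target).
    loss = student(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In this reading, "monolingual distillation" means the only LRL resource required is untranslated text; the teacher supplies the target side, and the student is updated with a full-model optimizer step rather than LoRA adapters, matching the design priorities named in the abstract.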
Primary Area: foundation or frontier models, including LLMs
Submission Number: 24819