Abstract: Accurate machine translation (MT) is essential for medical communication, particularly in low-resource languages like Luganda. However, existing models struggle with clinical precision, terminology consistency, and cultural adaptation. This study evaluates the performance of transformer-based MT models—MarianMT, NLLB-200, M2M-100, Mistral-7B, Google Translate, and fine-tuned medical models—on English–Luganda medical translation, with a focus on malaria diagnostics and community health communication. We introduce a clinician-validated parallel corpus and employ a hybrid evaluation framework combining BLEU, METEOR, TER, and direct expert assessments to measure clinical adequacy.
Fine-tuning NLLB-1.3B with LoRA demonstrated significant improvements, achieving the highest BLEU and METEOR scores while reducing computational costs. However, error analysis revealed persistent challenges in terminology alignment and contextual accuracy. Our findings highlight the limitations of generic MT models for medical use and emphasize the need for domain adaptation strategies. Future work will focus on expanding expert-driven evaluations, integrating human-in-the-loop feedback, and optimizing model architectures to enhance medical MT reliability in clinical settings.
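The hybrid evaluation framework pairs automatic metrics (BLEU, METEOR, TER) with expert review. As an illustration of the automatic side, the sketch below implements a minimal single-sentence BLEU with modified n-gram precision and a brevity penalty; it is a toy approximation for exposition, not the sacreBLEU-style scoring a real evaluation would use, and the smoothing constant is an assumption.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Toy sentence-level BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ng, ref_ng = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((cand_ng & ref_ng).values())  # clipped matches
        total = max(sum(cand_ng.values()), 1)
        precisions.append((overlap + 1e-9) / total)  # additive smoothing (assumed)
    # Brevity penalty discourages overly short hypotheses
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

For example, an exact match scores near 1.0, while a single substituted token lowers every overlapping n-gram precision and hence the score.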
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: machine translation, low-resourced languages, machine translation evaluation, Luganda machine translation, medical machine translation
Contribution Types: Approaches to low-resource settings, Approaches for low compute settings-efficiency, Data resources
Languages Studied: English, Luganda
Submission Number: 3817