LexiMCH: A Bilingual Medical Knowledge Lexicon for Maternal and Child Healthcare in Low-Resource Languages and Healthcare Environments
Abstract: Maternal and child healthcare (MCH) in low-resource contexts faces persistent challenges because linguistic and cultural barriers limit access to medical information. To address these barriers, we develop a multilingual terminology resource focusing on English and Amharic, built with a combination of machine translation, large language models (LLMs), and expert-in-the-loop validation. In this work, we evaluate a subset of 90 terms and definitions across multiple translation systems, including Google Translate, NLLB-200, M2M100, and several LLM variants (GPT, LLaMA, Gemma, DeepSeek, Gemini, and Mistral). We use BLEU, chrF, and ROUGE-L to assess translation quality for both terms and definitions. Preliminary results show variable performance across models, with DeepSeek-R1 achieving the highest BLEU scores (0.916 for definitions and 0.985 for terms) and LLM-assisted translations generally performing better on definitions than on terms. Ongoing work extends the evaluation to the full dataset and further refines the translation pipelines to produce a comprehensive, open-access, AI-ready resource for maternal and child healthcare in low-resource languages.
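As an illustration of one of the metrics named above, the following is a minimal sketch of sentence-level ROUGE-L, computed as the F1 score over the longest common subsequence (LCS) of reference and candidate tokens. It assumes plain whitespace tokenization and equal weighting of precision and recall; the function names `lcs_length` and `rouge_l_f1` are hypothetical, and the tokenization, weighting, and tooling actually used in the evaluation are not stated in the abstract and may differ.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Sentence-level ROUGE-L F1 (assumption: whitespace tokens, beta = 1)."""
    ref_tokens = reference.split()
    cand_tokens = candidate.split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Calling `rouge_l_f1(reference_definition, model_translation)` on each term or definition pair and averaging the results would give a corpus-level figure under these assumptions. Because the sketch compares whitespace-delimited tokens, it is sensitive to how Amharic punctuation and morphology are segmented, which is one reason a character-level metric such as chrF is often reported alongside it.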
Submission Number: 54