Keywords: Multilingual AI, Medical Reasoning, Reinforcement Learning, Curriculum Learning, Multilingual NLP, Low-Resource Languages, Biomedical NLP, AI for Healthcare
Abstract: While large language models (LLMs) have been shown to perform well on monolingual mathematical and commonsense reasoning, they remain unreliable for multilingual medical reasoning, hindering their deployment in multilingual healthcare settings.
We address this by first introducing CURE-Med-Bench, a high-quality multilingual medical reasoning dataset of open-ended reasoning queries, each with a single verifiable answer, spanning thirteen languages, including underrepresented languages such as Amharic, Yoruba, and Swahili.
Building on this dataset, we propose CURE-Med, a curriculum-informed reinforcement learning framework that integrates code-switching-aware supervised fine-tuning with Group Relative Policy Optimization (GRPO) to jointly improve logical correctness and language stability. Across all thirteen languages, our approach consistently outperforms strong baselines and scales effectively, achieving 85.21% language consistency and 54.35% logical correctness at 7B parameters, and 94.96% language consistency and 70.04% logical correctness at 32B parameters. These results support reliable and equitable multilingual medical reasoning in LLMs.
The code and dataset will be made publicly available upon acceptance.
Paper Type: Long
Research Area: Clinical and Biomedical Applications
Research Area Keywords: Multilingualism and Cross-Lingual NLP, NLP Applications, Language Modeling, Efficient/Low-Resource Methods for NLP
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Publicly available software and/or pre-trained models, Data resources
Languages Studied: Amharic (Am), Hausa (Ha), Swahili (Sw), Yoruba (Yo), Bengali (Bn), Hindi (Hi), French (Fr), Spanish (Es), Turkish (Tr), Vietnamese (Vi), Thai (Th), Japanese (Ja), Korean (Ko)
Submission Number: 4065