Bridging the Culture Gap: A Framework for LLM-Driven Socio-Cultural Localization of Math Word Problems in Low-Resource Languages
Keywords: African languages, mathematical word problems, math, large language models, multilinguality
Abstract: Large language models (LLMs) have demonstrated significant capabilities in solving mathematical problems expressed in natural language. However, multilingual and culturally grounded mathematical reasoning in low-resource languages lags behind English due to the scarcity of socio-cultural task datasets that reflect accurate native entities such as person names, organization names, and currencies. Existing multilingual benchmarks are predominantly produced via translation and typically retain English-centric entities, owing to the high cost of localization by human annotators. Moreover, automated localization tools are limited, so truly localized datasets remain scarce. In this work, we study the cultural robustness of large language models by examining the impact of culturally specific entities and the biases introduced by English-centric benchmarks. To bridge this gap, we introduce a framework for LLM-driven cultural localization of math word problems that automatically constructs datasets with native names, organizations, and currencies from existing sources. We find that translated benchmarks can obscure true multilingual math ability under appropriate socio-cultural contexts. Through extensive experiments, we also show that our framework helps mitigate English-centric entity bias and improves robustness when native entities are introduced across various languages.
Paper Type: Long
Research Area: Low-resource Methods for NLP
Research Area Keywords: data augmentation, LLM Efficiency, robustness, multilingual / low resource, mathematical NLP, datasets for low resource languages
Contribution Types: Approaches to low-resource settings, Publicly available software and/or pre-trained models, Data analysis
Languages Studied: Hausa, Swahili, Ewe, Twi, Wolof, Lingala, Luganda, Oromo, Shona, Xhosa, Yoruba, Kinyarwanda, Zulu, Sotho, Igbo, Amharic, French, English
Submission Number: 3577