AI in Yoruba STEM Education for Early Childhood Learning: A Study on Translation Quality and Context

Published: 06 Mar 2025, Last Modified: 10 Apr 2025
ICLR 2025 Workshop AI4CHL Poster
License: CC BY 4.0
Track: Full paper
Keywords: NLP, Neural Machine Translation, Early Child Learning, Underrepresented Languages, STEM Education, Yoruba STEM Dataset Curation, Language Modeling, Evaluating LLMs
TL;DR: Evaluating AI translation models for Yoruba STEM education, highlighting challenges in accuracy and the need for domain-specific fine-tuning
Abstract:

The integration of AI translation models into education is essential for expanding access to learning materials in low-resource languages like Yorùbá, spoken by more than 45 million people in Nigeria and beyond. Early childhood education in a child's native language is crucial to cognitive and academic development, significantly enhancing learning outcomes. However, assessing the contextual accuracy and domain-specific relevance of AI-generated translations remains a critical challenge. This study evaluates the translation quality of state-of-the-art multilingual models (Llama, Claude, DeepSeek, and AfriTeVa) for STEM education in Yorùbá. We used a curated STEM dataset digitized from textbooks and translated by expert linguists, then evaluated AI-generated translations against human references using BLEU, chrF, TER, and AfriCOMET. To ensure a fair comparison, all evaluation scores were normalized using min-max normalization. Our results reveal significant gaps in AI-generated STEM translations. Although models like DeepSeek and AfriTeVa perform relatively well in lexical accuracy and fluency, they struggle with domain-specific terminology and contextual integrity. Claude and Llama achieve higher BLEU scores but do not maintain scientific accuracy and pedagogical relevance. These findings underscore the need to fine-tune AI models specifically for Yorùbá STEM translation to improve contextual understanding and technical accuracy. Future work should focus on developing domain-adapted, culturally aware translation models and enhancing Yorùbá STEM datasets so that AI tools better support early childhood STEM education in underrepresented languages.
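The min-max normalization step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation; the metric values in the usage example are hypothetical placeholders, not results from the paper.

```python
def min_max_normalize(scores):
    """Rescale a list of metric scores to [0, 1] via min-max normalization:
    (s - min) / (max - min) for each score s."""
    lo, hi = min(scores), max(scores)
    if hi == lo:  # all scores identical: avoid division by zero, map to 0.0
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

# Hypothetical raw BLEU scores for four models (illustration only).
raw_bleu = [28.4, 31.9, 25.1, 30.2]
normalized = min_max_normalize(raw_bleu)
```

Normalizing each metric this way puts BLEU, chrF, TER, and AfriCOMET on a common [0, 1] scale, which makes per-model comparisons across metrics with different native ranges more interpretable.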

Submission Number: 30
