Why Large Language Models Fail for Hausa Educational Content: Cascading Errors from Translation to Speech to Comprehension

Honour-Jesus Bezaleel; Pearse Jim; Moses Daudu

Published: 02 Mar 2026, Last Modified: 08 Mar 2026; ICLR 2026 Workshop ICBINB; CC BY 4.0

Keywords: Large Language Models (LLMs), Hausa, Educational Technology, Machine Translation (MT), Automatic Speech Recognition (ASR), Low-Resource Languages, Error Correction, West African Senior School Certificate Examination (WAEC), LLM Cascading, Domain Adaptation.

TL;DR: This research tests whether LLMs can correct errors and answer questions in Hausa educational content.

Abstract: This research investigates the limitations of large language models (LLMs) in correcting errors in machine-translated text and transcribed speech, and in answering educational questions in Hausa that were translated from English. Despite continuing advances in LLM capabilities, we encounter persistent failures when applying these models to structured exam content, particularly because of omissions and distortions introduced during translation. Translating and processing examination text raises further difficulties, including data scarcity, linguistic complexity, and a lack of domain grounding. We evaluate multiple LLMs on a newly curated dataset of West African Senior School Certificate Examination (WAEC) past questions, focusing on their ability to (1) correct errors in machine-translated and speech-synthesized Hausa text and (2) answer multiple-choice exam questions derived from translated content. We assess models including Llama-3.2-1B-Instruct, Gemma-2-2B-IT, N-ATLaS, and HausaLLaMA, and evaluate LLM cascading for error correction using larger models such as Llama-3.3-70B, Mixtral-8×7B, Gemini 2.0 Flash, and Flan-T5. Our findings reveal substantial performance gaps across all models. We argue that effective solutions require domain-specific fine-tuning and close collaboration with educators and native speakers in the creation of educational text and audio. By documenting the real-world challenges of deploying LLM-based systems in low-resource educational settings, this study proposes approaches to overcome these barriers and suggests pathways toward more inclusive and reliable educational technology for underrepresented languages. Addressing these constraints would support more inclusive dissemination of educational knowledge.
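As a rough illustration of the two tasks named in the abstract, the sketch below shows one plausible reading of "LLM cascading" (text passed through a sequence of correctors, each seeing the previous stage's output) and a plain accuracy score for multiple-choice answers. The names `Model`, `cascade_correct`, and `mcq_accuracy` are hypothetical, the stages are ordinary Python callables standing in for real models, and the paper's actual pipeline and metrics may differ.

```python
from typing import Callable, Sequence

# Hypothetical stand-in: any function that maps input text to corrected text.
# In a real setup each stage would wrap a call to an LLM; none is made here.
Model = Callable[[str], str]


def cascade_correct(text: str, stages: Sequence[Model]) -> str:
    """Run text through each corrector in order (assumed cascade design)."""
    for stage in stages:
        text = stage(text)
    return text


def mcq_accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of questions whose predicted option letter matches the key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)


if __name__ == "__main__":
    # Toy stages: trim whitespace, then fix one known typo.
    toy_stages = [str.strip, lambda s: s.replace("aand", "and")]
    print(cascade_correct("  errors aand omissions ", toy_stages))
    print(mcq_accuracy(["a", "B", "c"], ["A", "B", "D"]))
```

Keeping the stages as plain callables means a small model and a larger corrector can be swapped in or reordered without changing the scoring code.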

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 68
