Keywords: question generation, education question generation, topic controlled question generation, knowledge distillation, small language model, multilingual question generation, ai for education
TL;DR: We show that knowledge distillation into an SLM is a viable pathway, competitive with LLMs, for training multilingual educational question generation models.
Abstract: Question Generation (QG) is central to Intelligent Tutoring Systems, but routine Large Language Model (LLM) inference is difficult to deploy in resource-constrained educational settings, including much of the Global South. We study whether sequence-level knowledge distillation can transfer multilingual, topic-controlled QG from a teacher LLM into a deployable Small Language Model (SLM) without commissioning per-language datasets. Using XQuAD contexts in 11 languages, Google Gemini 2.5 Flash generates synthetic training pairs and an mT5-small student is fine-tuned on the result. Against zero-shot LLMs and a bespoke augmented mT5 baseline, the distilled student retains 86--87\% of the teacher's lexical and WikiSemRel topic-control performance while using a far smaller model. The results suggest that distillation is a practical pathway for multilingual educational NLP under limited data and compute.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 93
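
The sequence-level distillation setup described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the record fields (`lang`, `topic`, `context`), the student input format, and the `teacher_generate` callable are hypothetical stand-ins for whatever prompt and Gemini 2.5 Flash call the paper actually uses. The `retention` helper shows one way a figure such as 86-87% can be read, assuming it is the student's score divided by the teacher's score on the same metric.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class DistillExample:
    source: str  # text fed to the student model (hypothetical format)
    target: str  # teacher-written question the student learns to reproduce


def build_distillation_set(
    records: Iterable[dict],
    teacher_generate: Callable[[str, str, str], str],
) -> List[DistillExample]:
    """Sequence-level distillation: the teacher's output text becomes the label."""
    examples: List[DistillExample] = []
    for rec in records:
        # teacher_generate is a placeholder for the real teacher LLM call.
        question = teacher_generate(rec["lang"], rec["topic"], rec["context"]).strip()
        if not question:
            continue  # skip empty teacher generations
        source = f"lang: {rec['lang']} topic: {rec['topic']} context: {rec['context']}"
        examples.append(DistillExample(source=source, target=question))
    return examples


def retention(student_score: float, teacher_score: float) -> float:
    """Student score as a percentage of the teacher score (assumed definition)."""
    if teacher_score <= 0:
        raise ValueError("teacher_score must be positive")
    return 100.0 * student_score / teacher_score
```

The resulting `source`/`target` pairs would then be passed to an ordinary sequence-to-sequence fine-tuning loop for the student (mT5-small in the abstract); that step is omitted here because the abstract gives no training details.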