Multimodal Question Generation and Evaluation Using Large Language Models

03 Oct 2025 (modified: 23 Dec 2025) · Submitted to MMLoSo 2025 · CC BY 4.0
Keywords: Multimodal Models, LLMs, Question Generation, Automatic Evaluation, Children's Books, Arabic
Abstract: To support the development of conversational agents for educational purposes, particularly those designed to engage children through interactive storytelling, there is a growing need for systems that can automatically generate relevant and pedagogically sound questions. Conversational agents can use such questions during interactive sessions to promote comprehension, reflection, and active participation. In this work, we develop an LLM-based pipeline that automates the generation of questions from story content, ensuring the appropriateness and clarity of questions to maximize children's learning outcomes. We use GPT-4o to generate interactive questions from stories across various modalities, covering question types such as completion, recall, open-ended, and Wh-questions. Our findings demonstrate the ability of the LLM to generate appropriate and contextually relevant questions, as well as its alignment with human judgment in the evaluation of automatically generated questions.
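The generation step described in the abstract could be sketched as follows. This is an illustrative sketch, not the authors' pipeline: the prompt wording, the per-type instructions, and the function name `build_messages` are all assumptions; only the model (GPT-4o) and the four question types come from the abstract.

```python
# Hypothetical sketch of assembling a per-type question prompt for a
# children's story, to be sent to GPT-4o via a chat-completions call.
# Prompt text and helper names are illustrative assumptions.

QUESTION_TYPE_INSTRUCTIONS = {
    "completion": "Write a fill-in-the-blank sentence taken from the story.",
    "recall": "Ask about a fact stated explicitly in the story.",
    "open-ended": "Ask a question inviting the child's own thoughts about the story.",
    "wh": "Ask a who/what/where/when/why question about the story.",
}

def build_messages(story_text: str, question_type: str) -> list[dict]:
    """Assemble chat messages requesting one child-appropriate question."""
    instruction = QUESTION_TYPE_INSTRUCTIONS[question_type]
    system = (
        "You generate clear, age-appropriate comprehension questions "
        "for children's stories."
    )
    user = f"Story:\n{story_text}\n\nTask: {instruction}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

# The messages would then be passed to the model, e.g. with the openai SDK:
# client.chat.completions.create(model="gpt-4o", messages=build_messages(story, "recall"))
```

One prompt per question type keeps each request focused, which tends to yield more controllable output than asking for all four types at once.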
Submission Number: 11