YMCQ: Reasoning-Enhanced MCQ Generation

Published: 2025 · Last Modified: 08 Jan 2026 · AIED (6) 2025 · License: CC BY-SA 4.0
Abstract: Automated multiple-choice question (MCQ) generation is valuable for scalable assessment and enhanced learning experiences. However, existing MCQ generation methods struggle to produce plausible distractors and to maintain answer consistency. This paper introduces a method for MCQ generation that integrates reasoning-based explanations for both correct answers and distractors, leveraging open-source language models fine-tuned on publicly available datasets. Our approach addresses these issues through three major improvements. First, over 300k questions from public datasets are augmented with synthetically generated reasoning explanations. Second, we fine-tune a Large Language Model (LLM) on these reasoning-based explanations to condition generation on correct reasoning and plausible misconceptions. Third, we introduce a multi-step filtering pipeline to ensure the validity of each question and the diversity of the generated distractors. This work demonstrates the effectiveness of reasoning-enhanced fine-tuning in improving MCQ generation quality while maintaining accessibility and cost efficiency. We release all resources as open source, including the synthetically augmented questions, training code, and the best model.