AI-Driven Generation and Evaluation of a Personalized AP Chemistry Question Framework Using Item Response Theory: A Computational Proof-of-Concept

14 Sept 2025 (modified: 08 Oct 2025), Submitted to Agents4Science, CC BY 4.0
Keywords: AI, Agent4Science, ML, LLM, IRT, AI in education
TL;DR: In a large-scale simulation, an AI agent using LLMs to generate AP Chemistry questions and Item Response Theory (IRT) to personalize them significantly improved virtual student scores, especially for lower-to-average ability learners.
Abstract: This paper presents and evaluates a methodological blueprint for an adaptive learning system that integrates LLM-generated content with Item Response Theory (IRT). We conducted a large-scale simulation (N=10,000) where an agent administered a 30-item AP Chemistry question bank to virtual students, personalizing the question sequence based on real-time ability estimates. Our IRT-personalized agent achieved a statistically significant and robust performance gain over both random selection and fixed-difficulty baselines ($p < 0.0001$). A detailed subgroup analysis revealed that this performance lift extended across all learner ability levels and was most pronounced for students in the lower-to-average ability quartiles. Our findings validate the effectiveness of this synergistic framework and provide a strong, reproducible baseline justifying future human-subject experiments.
Submission Number: 149
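
The personalization loop described in the abstract (estimate ability, choose the next question, observe the response, re-estimate) can be sketched as follows. This is a minimal illustration and not the authors' implementation. The abstract does not state which IRT model, selection rule, or estimator was used, so the two-parameter logistic (2PL) model, maximum-information item selection, grid-based posterior-mean (EAP) estimation with a standard normal prior, and all function names and parameter ranges below are assumptions.

```python
import math
import random

# Assumption: a 2PL model, where each item is a pair (a, b) of
# discrimination a and difficulty b. The source does not name the model.
def p_correct(theta, a, b):
    """Probability that a student of ability theta answers the item correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def pick_item(theta_hat, bank, used):
    """Choose the unused item that is most informative at the current estimate."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta_hat, *bank[i]))

# Ability grid from -4 to 4 in steps of 0.1 (hypothetical resolution).
GRID = [-4.0 + 0.1 * k for k in range(81)]

def eap_estimate(responses, bank):
    """Posterior-mean ability under a standard normal prior (assumed estimator)."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized N(0, 1) prior density
        for i, correct in responses:
            p = p_correct(t, *bank[i])
            w *= p if correct else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(GRID, weights)) / total

def simulate_student(true_theta, bank, n_items, rng):
    """Administer n_items adaptively to one virtual student."""
    responses, used, theta_hat = [], set(), 0.0
    for _ in range(n_items):
        i = pick_item(theta_hat, bank, used)
        used.add(i)
        correct = rng.random() < p_correct(true_theta, *bank[i])
        responses.append((i, correct))
        theta_hat = eap_estimate(responses, bank)
    return theta_hat, responses

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 30-item bank with made-up parameter ranges.
    bank = [(rng.uniform(0.8, 2.0), rng.uniform(-2.0, 2.0)) for _ in range(30)]
    theta_hat, responses = simulate_student(0.5, bank, 10, rng)
    print(f"estimated ability: {theta_hat:.2f} after {len(responses)} items")
```

A random-selection baseline of the kind the abstract mentions could be obtained by replacing `pick_item` with a uniform draw from the unused items, which would allow the two policies to be compared on the same virtual students.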