Edustories: A Collection of Real-world Case Studies from Classroom Practices

ACL ARR 2026 January Submission4470 Authors

05 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0
Keywords: dataset, education, teaching assistance, benchmark
Abstract: Despite the widely recognized potential of AI applications in education, most prior work has focused on individualized student assistance. In contrast, the majority of educational practice worldwide still takes place in collective classroom settings. To enable AI research grounded in classroom practice, we present Edustories, a dataset of 1,492 teacher-written case studies describing real elementary and high-school classroom situations involving challenging student behavior, pedagogical interventions, and their outcomes. Among its applications, Edustories enables a systematic evaluation of large language models in their ability to provide practising teachers with feedback on their interventions. By comparing the latest models from four language-model families with standardized expert assessments, we demonstrate that current models fall short of human expertise in predicting classroom outcomes: the strongest models reach 58% accuracy, compared with 64% for human experts, underscoring the limitations but also the emerging potential of current AI as teaching assistants rather than replacements.
Paper Type: Short
Research Area: Resources and Evaluation
Research Area Keywords: NLP datasets, benchmarking, evaluation
Contribution Types: Data resources
Languages Studied: Czech, English
Submission Number: 4470