BloomXplain: A framework and dataset for pedagogically sound LLM-generated explanations based on Bloom's Taxonomy

ACL ARR 2025 May Submission1911 Authors

18 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: The ability of Large Language Models (LLMs) to generate accurate and pedagogically sound instructional explanations is a sine qua non for their effective deployment in educational applications such as AI tutors and teaching assistants. However, little research has systematically evaluated their performance across varying levels of cognitive complexity. We argue that pursuing this direction serves a dual goal: producing outputs that are more educationally sound and human-aligned, and fostering more robust reasoning that in turn yields more accurate results. To this end, we introduce BloomXplain, a framework designed to generate and assess LLM-generated instructional explanations across the levels of Bloom's Taxonomy. We first construct a STEM-focused dataset of question–answer pairs categorized by Bloom's cognitive levels, filling a key gap in NLP resources. Using this dataset together with widely used benchmarks, we evaluate multiple LLMs under diverse prompting techniques, assessing correctness, alignment with Bloom's Taxonomy, and pedagogical soundness. Our findings show that BloomXplain not only produces more pedagogically grounded outputs but also achieves accuracy on par with, and sometimes exceeding, that of existing approaches. This work sheds light on the strengths and limitations of current models and paves the way for more accurate and interpretable results.
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: explanation faithfulness, free-text/natural language explanations
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 1911