Abstract: Didactics research asks which instructional strategies yield the best learning outcomes in expert-led explanatory dialogues on complex concepts. In this paper, we address this question by annotating a dataset of dialogues that foster scientific understanding, categorized across five levels of the explainee's knowledge. Our extended dataset, ReWIRED, features span-level annotations of teaching-related explanatory acts. Furthermore, we assess language models of varying sizes on their ability to label teaching acts, finding that fine-tuning is necessary to model the task; GPT-4o-mini with structured prediction in particular benefits from it. Finally, we leverage and extend a set of quality metrics for instructional explanations, asking annotators to estimate the relevance and impact of each metric across the five knowledge levels. We then apply the metrics to our newly annotated dialogues and expand them into a prompt-based framework, enhancing their applicability and scope. Our findings reveal a strong alignment between the quality metrics and the knowledge levels, with expert explanations in our dataset frequently reflecting established best teaching practices.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: free-text/natural language explanations, conversation, conversational modeling, zero/few-shot extraction, dialogue, NLP datasets, educational applications, evaluation
Contribution Types: NLP engineering experiment, Reproduction study, Data resources, Data analysis
Languages Studied: English
Submission Number: 873