Abstract: Didactics research asks which instructional strategies yield the best learning outcomes in expert-led explanatory dialogues on complex concepts. In this paper, we address this question by annotating a dataset of dialogues that foster scientific understanding, categorized across five levels of the explainee's knowledge. Our extended dataset, ReWIRED, features span-level annotations of teaching-related explanatory acts. Furthermore, we assess language models of varying sizes on their ability to label teaching acts, finding that fine-tuning is necessary to model the task; GPT-4o-mini with structured prediction in particular benefits from it. Finally, we leverage and extend a set of quality metrics for instructional explanations, asking annotators to estimate the relevance and impact of each metric across the five knowledge levels. We then apply the metrics to our newly annotated dialogues and expand them into a prompt-based framework, enhancing their applicability and scope. Our findings reveal a strong alignment between the quality metrics and the knowledge levels, with expert explanations in our dataset frequently reflecting established best teaching practices.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: free-text/natural language explanations, conversation, conversational modeling, zero/few-shot extraction, dialogue, NLP datasets, educational applications, evaluation
Contribution Types: NLP engineering experiment, Reproduction study, Data resources, Data analysis
Languages Studied: English
Submission Number: 873