Hierarchical Multi-Agent Orchestration for Interpretable Educational Dialogue: Emergent Reasoning in LLM-Powered Tutoring Systems
Keywords: Multi-agent systems, NLP, Educational dialogue, Large language models, Intelligent tutoring systems, Hierarchical architecture, Agent orchestration, Interpretability, Adaptive learning, Reasoning systems, LLM-powered agents
Abstract: Large Language Models (LLMs) have enabled conversational tutoring systems, yet these systems struggle to maintain logical consistency, provide specialized domain expertise, and offer interpretable reasoning across multi-turn educational dialogues. We present a hierarchical multi-agent architecture that decomposes educational dialogue into coordinated interactions among specialized agents organized across four tiers: perception, domain expertise, coordination, and strategic planning. Unlike monolithic approaches, in which a single model attempts all educational functions simultaneously, our framework treats adaptive tutoring as an emergent property of agent collaboration with explicit consistency constraints and interpretable decision-making. Drawing on a learning platform serving users and on evaluation against established reasoning benchmarks, we demonstrate that agent orchestration achieves substantial improvements: higher accuracy on deductive reasoning tasks, fewer temporal inconsistencies across dialogue sessions, and deeper multi-turn reasoning chains, while maintaining interpretability through agent specialization. We provide architectural principles for decomposing complex dialogue tasks, coordination protocols that ensure consistency, and an analysis of emergent behaviors, including cross-domain collaboration and adaptive scaffolding. Our work contributes a hierarchical agent orchestration approach that addresses fundamental limitations in educational dialogue systems, along with theoretical insights into interpretability through architectural decomposition.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: Agents, LLMs, Dialogue and Interactive Systems, Question Answering, NLP Applications, Explainability and Interpretability
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English-language benchmarks (ProofWriter, MATH, LogiQA)
Submission Number: 7932