Multi-Turn LLM Systems for Diagnostic Decision-Making: Considerations, Biases, and Challenges

Published: 06 Oct 2025, Last Modified: 04 Nov 2025
Venue: MTI-LLM @ NeurIPS 2025 Poster
License: CC BY-ND 4.0
Keywords: Multi-Agent System, Clinical Decision Making, Agentic AI, LLM, Agent Interaction
Abstract: This study investigates the systemic limitations and architectural design trade-offs of large language model multi-agent systems (LLM-MAS) for clinical decision support, focusing on how agent collaboration and architectural choices influence reasoning in complex medical problems. Through targeted ablation studies with the AgentClinic framework, we examine how changes in agent roles, interaction protocols, and architecture affect diagnostic accuracy and reasoning. Reflecting the time-sensitive and uncertain nature of clinical practice, these experiments evaluate system performance under limited information, constrained interaction depth, and variable access to expertise, and assess the potential amplification of emergent biases. Multi-turn agent interactions also exhibit systematic emergent biases across demographic categories, highlighting how such interactions can contribute to fairness concerns in clinical decision support. The results reveal meaningful variation across configurations, showing how collaboration strategies and information richness affect multi-turn diagnostic reasoning. This work provides a detailed view of the vulnerabilities and strengths of LLM-MAS, supporting future efforts to develop robust and clinically effective decision support systems.
Submission Number: 113