Examining the Vulnerability of Multi-Agent Medical Systems to Human Interventions for Clinical Reasoning

Published: 28 Sept 2025, Last Modified: 18 Oct 2025 | SEA @ NeurIPS 2025 Poster | CC BY 4.0
Keywords: Human interventions, Multi-agent, clinical reasoning, Multi-turn
TL;DR: Human interventions at vulnerable points in multi-agent medical AI can meaningfully alter the diagnostic trajectory, improving accuracy when correct but destabilizing reasoning and amplifying bias when incorrect.
Abstract: Human interventions at fault points can alter the diagnostic accuracy of multi-agent medical systems. We define fault points as moments in AI agent conversations at which an agent's reasoning is most vulnerable to external influence. Using the MedQA dataset, we analyzed simulated doctor-patient conversations to measure how interventions shifted reasoning and accuracy. Correct interventions improved diagnostic accuracy over baseline by up to 40%, while incorrect or bias-related interventions degraded performance by up to 6% and increased diagnostic drift and uncertainty. Beyond performance changes, our analysis revealed behavioral similarities between cognitive biases in simulated agent environments and those in real-world clinical practice, including premature closure and susceptibility to misleading cues. Overall, these findings suggest that identifying fault points and guiding them with human interventions may provide a mechanism for improving diagnostic robustness in multi-agent medical systems.
Archival Option: The authors of this submission want it to appear in the archival proceedings.
Submission Number: 88