Keywords: Artificial Intelligence, Healthcare Delivery, Conversational AI, Medical Agent
TL;DR: In the first large-scale study examining conversational medical AI in real-world conditions, we show improved patient satisfaction and sustained medical accuracy.
Abstract: The shortage of doctors is creating a critical squeeze in access to medical expertise. While
conversational Artificial Intelligence (AI) holds promise in addressing this problem, its safe deployment in
patient-facing roles remains largely unexplored in real-world medical settings. We present the first
large-scale evaluation of a physician-supervised conversational agent based on a large language model (LLM)
in a real-world medical setting.
Our agent, Mo, was integrated into an existing medical advice chat service. Over a three-week period,
we conducted a randomized controlled experiment with 926 cases to evaluate patient experience
and satisfaction. Among these, Mo handled 298 complete patient interactions, for which we report
physician-assessed measures of safety and medical accuracy.
Patients reported higher clarity of information (3.73 vs 3.62 out of 4, p < 0.05) and overall
satisfaction (4.58 vs 4.42 out of 5, p < 0.05) with AI-assisted conversations compared to standard care,
while showing equivalent levels of trust and perceived empathy. The high opt-in rate (81% among
respondents) exceeded previous benchmarks for AI acceptance in healthcare. Physician oversight
ensured safety, with 95% of conversations rated as “good” or “excellent” by general practitioners
experienced in operating a medical advice chat service.
Our findings demonstrate that carefully implemented AI medical assistants can enhance patient
experience while maintaining safety standards through physician supervision. This work provides
empirical evidence for the feasibility of AI deployment in healthcare communication and insights
into the requirements for successful integration into existing healthcare services.
Submission Number: 39