Keywords: Medical Triage, AI Decision Support, Multimodal Learning, Human-AI Collaboration, Cognitive Load, Patient Priority
Abstract: In disaster situations, rapid and accurate decision-making is critical for patient outcomes. Improper application of traditional triage protocols such as START and JumpSTART often leads to decision errors, namely over-triage and under-triage. In this project, we built an AI decision support system that assists in a range of emergency triage training scenarios and is designed to improve the accuracy, speed, and reliability of triage decisions. The system analyzes video, audio, and text from virtual emergency simulations and classifies patients in real time into urgency levels (red, yellow, green) according to who should be treated first. It uses GPT-4o with vision, real-time audio, and Retrieval-Augmented Generation (RAG) to follow standard protocols. To evaluate performance, we compared several OpenAI models on accuracy $A$, average response time $T$, and confidence score $C$, all measured on the same medical-triage simulation data. The o4-mini model consistently achieved the highest accuracy, $A = 0.667$, with an average response time of $T \approx 4.02$ s; gpt-4.1-nano was the fastest, with $T = 0.94$ s; and gpt-4o maintained the highest confidence score, $C = 0.836$. These results highlight the trade-offs between speed ($T$), accuracy ($A$), and confidence ($C$) when using AI for medical triage training. They are initial results, and we expect the metrics to change and improve as the research progresses with more training data and simulation runs, allowing more comprehensive evaluations. We also evaluate how AI-human collaboration affects accuracy, decision speed, and cognitive load, including performance when AI assistance is withdrawn. This helps us better understand how users interact with AI, identify potential risks, and learn how to improve both training and real-world triage performance.
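As a rough illustration of how the three metrics and the two error types named above might be computed from logged simulation runs, the following minimal sketch assumes each run records a predicted label, a reference label, a latency in seconds, and a model-reported confidence in [0, 1]. The `Trial` record, the `summarize` function, and the definitions of $A$, $T$, and $C$ as simple means are assumptions made for illustration; the abstract does not specify how the metrics were actually calculated.

```python
from dataclasses import dataclass
from statistics import mean

# Assumed severity ordering for the three urgency levels (green < yellow < red).
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


@dataclass
class Trial:
    """One hypothetical logged model response for one simulated patient."""
    predicted: str     # model's triage label: "red", "yellow", or "green"
    expected: str      # reference label from the triage protocol
    latency_s: float   # response time in seconds
    confidence: float  # model-reported confidence in [0, 1]


def summarize(trials):
    """Return accuracy A, mean response time T, mean confidence C,
    plus over-triage and under-triage rates (all assumed definitions)."""
    if not trials:
        raise ValueError("at least one trial is required")
    diffs = [SEVERITY[t.predicted] - SEVERITY[t.expected] for t in trials]
    n = len(trials)
    return {
        "A": sum(d == 0 for d in diffs) / n,           # fraction of exact matches
        "T": mean(t.latency_s for t in trials),        # average latency in seconds
        "C": mean(t.confidence for t in trials),       # average confidence
        "over_triage": sum(d > 0 for d in diffs) / n,  # predicted more urgent than reference
        "under_triage": sum(d < 0 for d in diffs) / n, # predicted less urgent than reference
    }
```

Calling `summarize` once per model on the same list of simulated patients would yield one row of a comparison table, which is one way to make the speed, accuracy, and confidence trade-offs directly comparable.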
Primary Area: foundation or frontier models, including LLMs
Submission Number: 24213