Comparative Evaluation of Clinical Large Language Models and Machine Learning to Predict Antimicrobial Resistance in Hospital-Onset Sepsis

Scott A. Cohen, Ziyi Chen, Jiang Bian, Christina Boucher, Yonghui Wu, Mattia Prosperi

Published: 01 Jan 2025, Last Modified: 08 Jul 2025. Venue: AIME (1) 2025. License: CC BY-SA 4.0

Abstract: Approaches to guide empiric antimicrobial therapy are needed, especially in critically ill populations with prevalent antimicrobial resistance (AMR). While artificial intelligence shows promise in predicting AMR, scalable and generalizable prediction models are essential for broad clinical adoption. We compared a publicly available clinical large language model (LLM), GatorTron, with traditional machine learning (ML) for predicting AMR and methicillin-resistant Staphylococcus aureus (MRSA)-specific patterns within a hospital-onset sepsis cohort, using electronic health record (EHR) data available at the time of illness onset. EHR data from approximately 150,000 hospitalizations with a documented bacterial infection at a large tertiary care healthcare system between 2010 and 2023 were examined. Among 2,019 eligible hospital-onset sepsis encounters, an AMR pathogen was identified in 911 (45%), and MRSA was isolated in 234 (26% of the AMR encounters). The LLM outperformed traditional models in predicting MRSA, achieving an AUC of 0.73 compared to 0.66 for the best traditional ML model, with a superior F1 score (0.43 vs. 0.16). Negative predictive value for MRSA prediction using the LLM was at least 90% across the majority of infection presentations. The LLM's superior prediction using a relatively simplified feature set demonstrates the potential of leveraging EHR data for early resistance prediction, though further refinement is needed to enhance sensitivity and clinical applicability.
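The cohort percentages and the relationship between negative predictive value (NPV), sensitivity, and F1 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the cohort counts come from the abstract, but the confusion-matrix counts passed to `confusion_metrics` (a hypothetical helper, not the authors' code) are invented to show how a high NPV can coexist with a modest F1 score when the positive class is uncommon.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return sensitivity, PPV, NPV, and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    npv = tn / (tn + fn) if (tn + fn) else 0.0
    denom = ppv + sensitivity
    f1 = 2 * ppv * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "ppv": ppv, "npv": npv, "f1": f1}


# Cohort counts as reported in the abstract.
eligible, amr, mrsa = 2019, 911, 234
amr_share = amr / eligible            # share of eligible encounters with an AMR pathogen
mrsa_share_of_amr = mrsa / amr        # share of AMR encounters with MRSA
mrsa_share_of_all = mrsa / eligible   # share of all eligible encounters with MRSA

# Hypothetical confusion counts (NOT from the paper), chosen only for illustration.
example = confusion_metrics(tp=20, fp=30, tn=180, fn=20)

print(f"AMR share of eligible encounters: {amr_share:.2f}")
print(f"MRSA share of AMR encounters:     {mrsa_share_of_amr:.2f}")
print(f"MRSA share of all encounters:     {mrsa_share_of_all:.2f}")
print({k: round(v, 2) for k, v in example.items()})
```

With these made-up counts, NPV is 0.90 while sensitivity is only 0.50 and F1 is about 0.44, which is the general pattern the abstract describes: negative predictions are mostly correct, yet many true positives are still missed.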