Personalized Case- and Evidence-Based TBI Prognosis with Small Language Models

Published: 19 Aug 2025, Last Modified: 12 Oct 2025, Venue: BHI 2025, License: CC BY 4.0
Confirmation: I have read and agree with the IEEE BHI 2025 conference submission's policy on behalf of myself and my co-authors.
Keywords: Small Language Models, Case-based Reasoning, Evidence-based Practice, Traumatic Brain Injury, ED Disposition
Abstract: Timely and accurate emergency department disposition for traumatic brain injury patients requires rapid synthesis of complex, multimodal data. Yet in practice, such decisions often rely on heuristics, resulting in variable outcomes. While large language models show promise for supporting evidence-based practice, their clinical deployment is limited by size, cost, and privacy concerns. We present a dual retrieval-augmented framework that leverages efficient, on-premise small language models and unifies evidence-based practice with case-based reasoning to enable personalized disposition prediction of patients with traumatic brain injury. Evidence-based practice is modeled by retrieving guideline passages tailored to each patient's presentation, while case-based reasoning retrieves similar patients as few-shot exemplars. This dual-retrieval strategy personalizes both clinical guidelines and case-based exemplars, enabling the language model to produce predictions that integrate guideline alignment with patient-specific context. We implemented this framework using two open-source language models under 4B parameters—Phi-4-mini and Qwen-2.5. Across both models, similar patient exemplars consistently improved classification performance, increasing sensitivity without sacrificing specificity. Clinical guidelines had less impact on performance, but when combined with exemplars, they shifted predictions toward more conservative, guideline-consistent behavior. Clinician evaluations suggest that while adding similar patient exemplars enhances accuracy, overreliance on exemplars may diminish reasoning quality, whereas guidelines improve the clinical relevance and justification of model outputs. These findings underscore how targeted retrieval can personalize both predictions and their rationale, enhancing the performance, interpretability, and trustworthiness of AI-assisted clinical decision-making.
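Illustrative sketch: the dual-retrieval idea in the abstract can be pictured as two independent top-k lookups, one over guideline passages and one over prior patients, whose results are merged into a single few-shot prompt. The abstract does not specify the retriever, the prompt template, or the data schema, so everything below is an assumption made for illustration. The bag-of-words cosine similarity stands in for whatever retriever the authors used, the function names (`top_k`, `build_prompt`) are hypothetical, and the example strings are invented placeholders, not clinical guidance or study data.

```python
import math
import re
from collections import Counter


def _vec(text):
    """Bag-of-words term counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query, texts, k):
    """Return the indices of the k texts most similar to the query."""
    q = _vec(query)
    ranked = sorted(
        range(len(texts)), key=lambda i: _cosine(q, _vec(texts[i])), reverse=True
    )
    return ranked[:k]


def build_prompt(patient, guidelines, cases, k_guidelines=1, k_cases=1):
    """Assemble a prompt from retrieved guideline passages and similar cases.

    `cases` is a list of (description, disposition) pairs (hypothetical schema).
    """
    g_idx = top_k(patient, guidelines, k_guidelines)
    c_idx = top_k(patient, [desc for desc, _ in cases], k_cases)
    lines = ["Relevant guideline passages:"]
    lines += [f"- {guidelines[i]}" for i in g_idx]
    lines.append("Similar prior patients:")
    lines += [f"- {cases[i][0]} -> disposition: {cases[i][1]}" for i in c_idx]
    lines.append(f"New patient: {patient}")
    lines.append("Predicted disposition:")
    return "\n".join(lines)


# Invented placeholder data, for illustration only.
guidelines = [
    "Passage about intracranial hemorrhage on CT and admission.",
    "Passage about normal CT and discharge criteria.",
]
cases = [
    ("older adult fall hemorrhage on CT", "admit"),
    ("minor head bump normal CT", "discharge"),
]
prompt = build_prompt("fall with intracranial hemorrhage on CT", guidelines, cases)
print(prompt)
```

The resulting string would then be passed to an on-premise small language model. Setting `k_guidelines` or `k_cases` to 0 drops that retrieval source from the prompt, which gives a rough analogue of the guideline-only and exemplar-only conditions the abstract compares.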
Track: 2. Bioinformatics
Registration Id: 42NPRFRK2L7
Submission Number: 111