Submission Type: Long
Keywords: Electronic Health Record (EHR), Retrieval-Augmented Generation (RAG), LLM, Question answering
TL;DR: We use EHR tabular data to retrieve similar patients' clinical notes for LLM-based medical QA. This EHR-based RAG outperforms traditional text-based RAG on DischargeQA, our high-quality and challenging dataset designed to mimic real-world cases.
Abstract: To improve the reliability of Large Language Models (LLMs) in clinical applications, retrieval-augmented generation (RAG) is extensively applied to provide factual medical knowledge. However, beyond general medical knowledge from open-ended datasets, clinical case-based knowledge is also critical for effective medical reasoning, as it provides context grounded in real-world patient experiences.
Motivated by this, we propose the Experience Retrieval-Augmentation (ExpRAG) framework based on Electronic Health Records (EHR), which explicitly leverages prompt optimization through retrieval to dynamically construct informative prompts from similar patients' discharge reports.
ExpRAG performs retrieval through a coarse-to-fine process: it first efficiently identifies similar patients, then extracts task-relevant clinical context from their reports to optimize the prompt content. We conduct systematic ablation studies to evaluate different prompt-optimization strategies, including varying the number of retrieved patients, the retrieval techniques, and the weighting of retrieved medical contexts.
To evaluate EHR-based RAG systems, including ExpRAG and medical agents, we introduce DischargeQA, a clinical QA dataset with 1,280 discharge-related questions across diagnosis, medication, and instruction tasks. Each question is generated from EHR data to ensure realistic and challenging scenarios. Experimental results demonstrate that ExpRAG consistently outperforms a text-based ranker, achieving an average relative improvement of 5.2% and highlighting the importance of case-based knowledge for medical reasoning.
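The coarse-to-fine retrieval described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Patient` record, the Jaccard similarity over structured codes, and the token-overlap sentence scoring are all assumptions chosen only to make the two stages concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical record: structured EHR codes plus a free-text discharge report.
    patient_id: str
    codes: frozenset
    report: str


def jaccard(a, b):
    """Set overlap between two code sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coarse_retrieve(query_codes, cohort, k):
    """Coarse stage: rank patients by structured-data similarity, keep top k."""
    ranked = sorted(cohort, key=lambda p: jaccard(query_codes, p.codes), reverse=True)
    return ranked[:k]


def fine_extract(question, patients, max_sentences):
    """Fine stage: keep report sentences that share tokens with the question."""
    q_tokens = set(question.lower().split())
    scored = []
    for p in patients:
        for sentence in p.report.split("."):
            sentence = sentence.strip()
            if not sentence:
                continue
            overlap = len(q_tokens & set(sentence.lower().split()))
            scored.append((overlap, sentence))
    # Python's sort is stable, so ties keep the coarse-stage patient order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [s for overlap, s in scored[:max_sentences] if overlap > 0]


def build_prompt(question, context):
    """Assemble the retrieved context and the question into one prompt string."""
    lines = ["Context from similar patients:"]
    lines += [f"- {c}" for c in context]
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

In this sketch the number of retrieved patients `k` and the context budget `max_sentences` correspond loosely to the ablation axes mentioned above; a real system would likely replace both similarity functions with learned or more robust retrievers.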
Submission Number: 24