Retrieval-Augmented Foundation Model Enhances Risk Prediction Using Electronic Health Records

Published: 28 May 2026, Last Modified: 11 Jun 2026
Venue: ICML 2026 FM4LS Workshop Poster
License: CC BY 4.0
Keywords: Retrieval, EHR, Risk, Prediction, RAG
TL;DR: Retrieval-Augmented EHR Foundation Model
Abstract: Electronic Health Records (EHR) contain rich longitudinal patient information widely used for predictive modeling. However, effectively leveraging historical data remains challenging due to long trajectories, event heterogeneity, temporal irregularity, and varying relevance of past visits. Existing approaches rely on fixed windows or uniform aggregation, which may obscure clinically important signals. We introduce $\texttt{EHR-RAGp}$, a retrieval-augmented foundation model that dynamically integrates relevant patient history. We construct an EHR vector database via clinically relevant chunking strategies and employ a prototype-guided retrieval module to identify and weight the most relevant historical segments for a given prediction task. Across multiple tasks, $\texttt{EHR-RAGp}$ consistently outperforms state-of-the-art EHR foundation models.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 53