Exploring multi-site dataset shifts in electronic health records using time series features

Published: 23 Sept 2025, Last Modified: 18 Oct 2025, Venue: TS4H NeurIPS 2025, License: CC BY 4.0
Keywords: electronic health record, dataset shift, multisite, time series features
TL;DR: We elucidate differences in electronic health records across institutions from a dataset shift perspective, shedding light on the need for careful harmonization or deployment considerations when developing and using models with such data.
Abstract: Models developed using longitudinal electronic health record (EHR) data can demonstrate inconsistent abilities to generalize to new data at different institutions. Rather than relying only on external validity of performance, we consider how distributional shifts in EHR data can inform multi-site generalizability without the need for task-specific models or annotations. Extending statistical dataset shift detection to time series through feature-based temporal analysis, we compare the EHR data from five different institutions and four different prior patient conditions for patients requiring the administration of an inpatient diuretic. We illustrate which sites exhibit greater variability as well as the EHR measures contributing to the variation, providing valuable insight into downstream deployment.
Submission Number: 53