Keywords: Masked Autoencoders, Self-supervised Learning, Representation Learning, Masking Strategies, Electronic Health Records (EHR), Clinical Time Series, Volatility-Aware Masking
TL;DR: We show that leveraging clinical volatility through CV-Masking creates better EHR foundation models, with faster convergence and stronger predictive performance.
Abstract: Masked autoencoder (MAE) models are increasingly applied to electronic health records (EHR) as a pre-training method to learn general-purpose representations that support diverse downstream clinical tasks. However, existing approaches typically rely on uniform random masking, implicitly assuming that all clinical features are equally predictable. In practice, laboratory tests exhibit substantial heterogeneity in temporal volatility: certain biomarkers (e.g., sodium) remain relatively stable, whereas others (e.g., lactate) fluctuate considerably and are more challenging to model. To address this limitation, we propose CV-Masking, a volatility-aware pretraining strategy that adaptively adjusts masking probabilities according to the intrinsic variability of each feature. Our experiments on a large panel of laboratory tests demonstrate that CV-Masking consistently outperforms both random and variance-based masking strategies, yielding improved downstream predictive performance and faster convergence.
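To make the masking mechanism concrete, below is a minimal sketch of volatility-aware masking. It assumes "CV" stands for coefficient of variation (std / |mean|), that per-feature probabilities are obtained by rescaling CVs around a target average masking rate, and that masks are sampled per entry; the names `cv_masking_probs`, `apply_mask`, `base_rate`, and the clipping bounds are illustrative choices, not the paper's specified implementation.

```python
import numpy as np

def cv_masking_probs(X, base_rate=0.5, eps=1e-8):
    """Per-feature masking probabilities scaled by the coefficient of
    variation (CV = std / |mean|), computed over the training data.

    X: array of shape (n_samples, n_features) of lab values.
    Returns probabilities whose mean is approximately base_rate.
    """
    cv = X.std(axis=0) / (np.abs(X.mean(axis=0)) + eps)
    # Rescale so the average masking rate matches base_rate (assumed
    # normalization), then clip to keep probabilities valid and ensure
    # every feature is sometimes masked and sometimes visible.
    probs = base_rate * cv / cv.mean()
    return np.clip(probs, 0.05, 0.95)

def apply_mask(X, probs, rng=None):
    """Sample a Bernoulli mask per entry; True marks values the MAE
    must reconstruct. probs broadcasts across rows, so more volatile
    features (e.g., lactate) are masked more often than stable ones
    (e.g., sodium)."""
    rng = rng or np.random.default_rng()
    return rng.random(X.shape) < probs
```

Under this reading, the MAE reconstruction loss is concentrated on the hard-to-predict, high-volatility features, which is one plausible account of the faster convergence the abstract reports.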
Submission Number: 75