Keywords: Foundation model, Electrocardiogram analysis, Contrastive Learning, Self-Supervised Learning, Time Series
TL;DR: We propose CLEF, a foundation model for electrocardiogram (ECG) analysis guided by future health risk.
Abstract: The electrocardiogram (ECG) is a key diagnostic tool in cardiovascular care. Single-lead ECG recording is integrated into both clinical-grade and consumer wearables. While self-supervised pretraining of foundation models on unlabeled ECGs improves diagnostic performance, existing approaches do not incorporate domain knowledge from clinical metadata. We introduce a novel contrastive approach, clinically-guided contrastive learning, that uses an established clinical risk score to adaptively weight negative pairs. It aligns the similarities of ECG embeddings with clinically meaningful differences between subjects and includes an explicit mechanism for handling missing metadata. Using 12-lead ECGs from 161K patients in the MIMIC-IV dataset, we pretrain single-lead ECG foundation models at three scales, collectively called CLEF, using only routinely collected metadata and no per-sample ECG annotations. We evaluate CLEF on $18$ clinical classification and regression tasks across $7$ held-out datasets, and benchmark against $5$ foundation model baselines and $3$ self-supervised algorithms. When pretrained on 12-lead ECG data and tested on lead-I data, CLEF outperforms self-supervised foundation model baselines: the medium-sized CLEF achieves average AUROC improvements of at least $2.6\%$ in classification and average MAE reductions of at least $3.2\%$ in regression. Compared with existing self-supervised learning algorithms, CLEF improves the average AUROC by at least $1.8\%$. Moreover, when pretrained only on lead-I data, CLEF performs comparably on classification tasks to the state-of-the-art ECGFounder, which was trained in a supervised manner. Overall, CLEF enables more accurate and scalable single-lead ECG analysis, advancing remote health monitoring. We will publish our code and pretrained CLEF models.
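The abstract describes weighting negative pairs in a contrastive loss by a clinical risk score. Below is a minimal, hypothetical sketch of that idea, not the authors' implementation: the function name, the specific weighting scheme (scaling negatives by the absolute risk-score difference), the temperature, and the NaN-based handling of missing metadata are all illustrative assumptions.

```python
# Hypothetical sketch of clinically-guided contrastive weighting (assumed details,
# not the paper's actual loss): negatives between subjects with larger differences
# in a clinical risk score are up-weighted, and missing scores fall back to
# uniform weighting.
import torch
import torch.nn.functional as F

def clinically_guided_contrastive_loss(z1, z2, risk, temperature=0.1):
    """z1, z2: (N, D) embeddings of two views of N ECGs.
    risk: (N,) clinical risk scores; NaN marks missing metadata."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature              # (N, N) pairwise similarities
    # Assumed weighting: pairs with larger risk differences count more as negatives,
    # so embedding similarity is pushed to reflect clinical dissimilarity.
    diff = (risk[:, None] - risk[None, :]).abs()
    missing = torch.isnan(diff)
    diff = torch.nan_to_num(diff, nan=0.0)
    weights = 1.0 + diff / (diff.max() + 1e-8)
    weights[missing] = 1.0                          # missing metadata: uniform weight
    weights.fill_diagonal_(1.0)                     # positives keep unit weight
    # Weighted InfoNCE: scale exponentiated similarities of negatives by the weights.
    exp_logits = torch.exp(logits) * weights
    log_prob = logits.diag() - torch.log(exp_logits.sum(dim=1))
    return -log_prob.mean()
```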
Primary Area: foundation or frontier models, including LLMs
Submission Number: 9018