Two-Stage Contrastive Language Electrocardiogram Pre-training for Fine-Grained Waveform Features

Published: 09 Jun 2025, Last Modified: 09 Jun 2025, FMSD @ ICML 2025, CC BY 4.0
Keywords: ECG, report, multi-modal contrastive pre-training
Abstract: Electrocardiograms (ECGs) play a vital role in diagnosing cardiovascular diseases. While recent ECG-text contrastive learning methods have shown promise, they often overlook the incomplete nature of clinical reports. Typically, a report is generated by identifying key waveform features and then deriving a diagnosis, yet these intermediate features are rarely documented. This gap limits the model’s ability to capture waveform patterns and understand the underlying diagnostic reasoning. To address this, we propose FG-CLEP (Fine-Grained Contrastive Language ECG Pre-training), which leverages large language models (LLMs) to recover the missing waveform features from incomplete reports. To further improve performance, we introduce a semantic similarity matrix to reduce false negatives caused by the prevalence of common diagnoses and adopt a sigmoid-based loss function to handle the multi-label nature of ECG tasks. Experiments on six datasets show that FG-CLEP consistently outperforms state-of-the-art methods in both zero-shot prediction and linear probing.
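The abstract mentions a sigmoid-based loss combined with a semantic similarity matrix to soften false negatives. The following is a minimal sketch of that idea (in the style of SigLIP-type pairwise losses), not the paper's actual implementation: the function name, the temperature and bias defaults, and the form of `sem_sim` (soft pair labels in [0, 1], replacing the identity matrix) are all assumptions for illustration.

```python
import numpy as np

def sigmoid_pairwise_loss(ecg_emb, txt_emb, sem_sim, temperature=10.0, bias=-10.0):
    """Sketch of a sigmoid-based multi-label contrastive loss.

    Each ECG-report pair (i, j) is scored independently with a sigmoid
    (no row-wise softmax), and its binary target comes from a semantic
    similarity matrix `sem_sim` instead of the identity matrix, so two
    samples sharing a common diagnosis are not forced apart as negatives.
    """
    # L2-normalize so the dot product is a cosine similarity.
    ecg = ecg_emb / np.linalg.norm(ecg_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = temperature * ecg @ txt.T + bias  # (n, n) pairwise scores
    targets = sem_sim                          # soft labels in [0, 1]
    # Numerically stable binary cross-entropy with logits, averaged over all pairs.
    loss = np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))
    return loss.mean()
```

With `sem_sim = np.eye(n)` this reduces to a standard hard-negative pairwise objective; raising off-diagonal entries toward 1 for semantically similar reports is what down-weights the false negatives the abstract describes.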
Submission Number: 13