The Impact of Protected Variable Integration in Multimodal Pretraining: A Case Study on ECG Waveforms and ECG Notes Pretraining

Published: 19 Aug 2025, Last Modified: 24 Sept 2025, IEEE BSN 2025, CC BY 4.0
Confirmation: I have read and agree with the IEEE BSN 2025 conference submission's policy on behalf of myself and my co-authors.
Keywords: Electrocardiogram, Contrastive Pretraining, Multimodal Representation Learning, Fairness, Robustness
TL;DR: Demographic-aware ECG pretraining makes downstream models fairer across subgroups
Abstract: Electrocardiogram (ECG) interpretation using deep learning has shown promising results in detecting cardiac rhythm abnormalities. However, growing evidence suggests that model performance can vary significantly across demographic subgroups, raising concerns about algorithmic fairness in clinical deployment. In this study, we explore whether incorporating protected variables—specifically age and sex—into multimodal contrastive pretraining can reduce downstream performance disparities. We use a CLIP-style architecture to align ECG signals with machine-generated rhythm descriptions, training two variants: one with text alone and one with demographic augmentation. After pretraining, we evaluate frozen ECG embeddings using linear probing on a binary classification task distinguishing normal from abnormal rhythms. Our results show that including demographic information during pretraining can reduce performance gaps across age groups and maintain comparable or improved accuracy across sexes. These findings highlight the potential of fairness-aware representation learning to improve subgroup equity in clinical machine learning applications.
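The pretraining setup described in the abstract—a CLIP-style two-tower model aligning ECG signals with rhythm descriptions, where the demographic-augmented variant appends age and sex to each caption—can be sketched as follows. This is a minimal illustration, not the paper's implementation: the encoder architectures, embedding size, caption template, and tokenization are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def demographic_caption(rhythm: str, age: int, sex: str) -> str:
    """Augment a machine-generated rhythm description with protected variables
    (the exact template used in the paper is not specified; this is illustrative)."""
    return f"{rhythm}. Patient: {age}-year-old {sex}."

class ECGTextCLIP(nn.Module):
    """Toy two-tower model: a 1-D conv ECG encoder and a bag-of-embeddings text encoder."""
    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.ecg_encoder = nn.Sequential(
            nn.Conv1d(12, 32, kernel_size=7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, dim),
        )
        self.text_encoder = nn.EmbeddingBag(vocab_size, dim)  # mean-pools token embeddings
        self.logit_scale = nn.Parameter(torch.tensor(1.0))    # learnable temperature

    def forward(self, ecg, token_ids):
        z_ecg = F.normalize(self.ecg_encoder(ecg), dim=-1)
        z_txt = F.normalize(self.text_encoder(token_ids), dim=-1)
        return z_ecg, z_txt

def clip_loss(z_ecg, z_txt, scale):
    """Symmetric InfoNCE: matched (ECG, caption) pairs lie on the diagonal."""
    logits = scale * z_ecg @ z_txt.t()
    labels = torch.arange(len(z_ecg))
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels))

# One pretraining step on a random batch of 12-lead ECGs and tokenized captions.
model = ECGTextCLIP()
ecg = torch.randn(8, 12, 500)             # batch of 8 short 12-lead signals (synthetic)
tokens = torch.randint(0, 1000, (8, 16))  # caption token ids (illustrative tokenizer)
z_ecg, z_txt = model(ecg, tokens)
loss = clip_loss(z_ecg, z_txt, model.logit_scale.exp())
loss.backward()
```

After pretraining, the downstream evaluation described in the abstract would freeze `ecg_encoder` and fit a linear probe (e.g. logistic regression) on its embeddings for the normal-vs-abnormal rhythm task, comparing subgroup metrics between the text-only and demographic-augmented variants.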
Track: 3. Signal processing, machine learning, deep learning, and decision-support algorithms for digital and computational health
Tracked Changes: pdf
NominateReviewer: Sicong Huang, siconghuang@tamu.edu
Submission Number: 49