The Impact of Protected Variable Integration in Multimodal Pretraining: A Case Study on ECG Waveforms and ECG Notes Pretraining
Confirmation: I have read and agree with the IEEE BSN 2025 conference submission's policy on behalf of myself and my co-authors.
Keywords: Electrocardiogram, Contrastive Pretraining, Multimodal Representation Learning, Fairness, Robustness
TL;DR: ECG pretraining with demographic information makes downstream model performance fairer across subgroups
Abstract: Electrocardiogram (ECG) interpretation using deep learning has shown promising results in detecting cardiac rhythm abnormalities. However, growing evidence suggests that model performance can vary significantly across demographic subgroups, raising concerns about algorithmic fairness in clinical deployment. In this study, we explore whether incorporating protected variables (specifically age and sex) into multimodal contrastive pretraining can reduce downstream performance disparities. We use a CLIP-style architecture to align ECG signals with machine-generated rhythm descriptions, training two variants: one with text alone and one with demographic augmentation. After pretraining, we evaluate frozen ECG embeddings using linear probing on a binary classification task distinguishing normal from abnormal rhythms. Our results show that including demographic information during pretraining can reduce performance gaps across age groups and maintain comparable or improved accuracy across sex. These findings highlight the potential of fairness-aware representation learning to improve subgroup equity in clinical machine learning applications.
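The abstract's pretraining setup can be illustrated with a minimal sketch. This is not the authors' implementation: the text-augmentation template and the symmetric InfoNCE loss below are assumptions based on the standard CLIP recipe, with NumPy standing in for the encoders' output embeddings.

```python
import numpy as np

def augment_text(rhythm_desc, age, sex):
    # Hypothetical demographic augmentation: prepend age/sex to the
    # machine-generated rhythm description before text encoding.
    return f"{sex}, age {age}. {rhythm_desc}"

def clip_loss(ecg_emb, text_emb, temperature=0.07):
    # Symmetric InfoNCE loss over L2-normalized embeddings (CLIP-style).
    e = ecg_emb / np.linalg.norm(ecg_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = e @ t.T / temperature       # (batch, batch) cosine similarities
    labels = np.arange(len(logits))      # matched ECG/text pairs on the diagonal

    def xent(lg):
        # Numerically stable cross-entropy against the diagonal labels.
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average the ECG-to-text and text-to-ECG directions.
    return 0.5 * (xent(logits) + xent(logits.T))
```

Correctly paired embeddings should yield a lower loss than mismatched ones, which is the signal that drives the alignment during pretraining; the downstream linear probe then operates on the frozen ECG branch.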
Track: 3. Signal processing, machine learning, deep learning, and decision-support algorithms for digital and computational health
Tracked Changes: pdf
NominateReviewer: Sicong Huang, siconghuang@tamu.edu
Submission Number: 49