Keywords: ECG generation, diffusion models, structured conditioning, demographic fidelity, synthetic medical data
TL;DR: DCDM-ECG is a 10.1M-parameter latent-diffusion ECG generator with structured demographic conditioning; it matches real PTB-XL demographics within σ and beats SSSD-ECG on TSTR by 4.5 percentage points.
Abstract: Conditional 12-lead ECG generators are increasingly used as drop-in synthetic cohorts, but they quietly distort patient demographics. We argue that demographics should enter the generator through a structured numeric channel rather than only through diagnostic labels. We introduce DCDM-ECG, a 10.1M-parameter conditional latent diffusion model whose 76-dimensional condition vector concatenates 71 multi-hot SCP codes with five z-normalised numeric demographic axes (age, sex, height, weight, heart rate). On PTB-XL, DCDM-ECG matches the real age distribution within σ, follows the specified heart rate to within ~3 bpm per sample, and reaches a train-on-synthetic, test-on-real (TSTR) macro-AUROC of 0.885±0.012 (5 seeds, n=8000), 4.5 percentage points above the strongest reported label-only baseline. In an ablation at the matched n=1500 diagnostic protocol, keeping the architecture and training recipe fixed but removing the demographic axes from the conditioning vector drops macro TSTR from 0.94 to 0.60, isolating the contribution of structured demographic conditioning.
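As an illustration of the 76-dimensional condition described in the abstract, the following minimal sketch assembles 71 multi-hot SCP entries and five z-normalised demographic values into one vector. The function name `build_condition`, the ordering of the axes within the vector, the numeric encoding of sex, and the normalisation statistics are all assumptions made for this example; the abstract does not specify them.

```python
import numpy as np

N_SCP = 71  # number of SCP codes, as stated in the abstract
# Axis order is an assumption; the abstract only lists the five axes.
DEMO_AXES = ("age", "sex", "height", "weight", "heart_rate")


def build_condition(scp_indices, demographics, mean, std):
    """Hypothetical helper: 71 multi-hot SCP entries followed by 5 z-scores."""
    multi_hot = np.zeros(N_SCP, dtype=np.float32)
    multi_hot[list(scp_indices)] = 1.0  # set every active SCP code to 1

    raw = np.array([demographics[k] for k in DEMO_AXES], dtype=np.float32)
    z = (raw - mean) / std  # z-normalise each demographic axis

    return np.concatenate([multi_hot, z]).astype(np.float32)


# Illustrative statistics only; in practice they would be estimated
# from the training split.
mean = np.array([60.0, 0.5, 170.0, 75.0, 70.0], dtype=np.float32)
std = np.array([15.0, 0.5, 10.0, 15.0, 15.0], dtype=np.float32)

patient = {"age": 75.0, "sex": 1.0, "height": 170.0,
           "weight": 60.0, "heart_rate": 85.0}

# SCP indices 0 and 12 are arbitrary placeholders for two active codes.
cond = build_condition([0, 12], patient, mean, std)
print(cond.shape)
```

Under the same assumptions, one way to emulate the label-only ablation is to pass only the first 71 entries (`cond[:N_SCP]`) to the generator; whether the paper drops these axes or masks them is not stated in the abstract.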
Submission Number: 128