Keywords: ECG representation learning, Quantization, Time-series analysis
Abstract: Automated electrocardiogram (ECG) interpretation is crucial for cardiovascular diagnostics, with ECG self-supervised learning (eSSL) emerging as a powerful paradigm to exploit large-scale unlabeled datasets. However, current eSSL frameworks suffer from two key limitations: their learned representations are typically continuous, high-dimensional and opaque, hindering clinical trust; and they are anchored to coarse, human-defined waveform concepts, thus limiting the model's intrinsic capacity to discover novel biomarkers from finer sub-waveform morphologies. To address these issues, we propose **AtomECG**, an eSSL framework that views the ECG as a discrete sequence of fundamental and reusable morphological **atoms** governed by an underlying **grammar**. Specifically, we introduce **Two-Scale Manifold Alignment**, a novel quantization scheme. By simultaneously learning a "grammar manifold" for the entire codebook and employing a geometry-aware alignment for individual patches, AtomECG maps finer sub-waveform morphologies to discrete atoms. Extensive experiments demonstrate that AtomECG not only achieves state-of-the-art performance on a wide range of diagnostic tasks but also provides strong interpretability by explicitly mapping specific atoms to pathological patterns. Furthermore, AtomECG shows potential for long-term monitoring and demonstrates robust generalization across diverse patient populations, underscoring its promise for clinical deployment.
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 6599