Keywords: Health, Foundation models, Knowledge distillation, Unsupervised learning, Self-supervised learning, Biosignals, Wearable devices
TL;DR: A framework for distilling representational knowledge from high-fidelity to low-fidelity biosignals, improving downstream performance and enabling model compression.
Abstract: Modern wearable devices can conveniently and continuously record various biosignals in the many different environments of daily living, ultimately enabling a rich view of individual health.
However, not all biosignals are the same: high-fidelity measurements, such as photoplethysmography (PPG), contain more physiological information, but require optical sensors with a high power footprint. In a resource-constrained setting, such high-fidelity biosignals may be unavailable. Alternatively, a lower-fidelity biosignal, such as those from an accelerometer, has a significantly smaller power footprint and is available in almost any wearable device. While multi-modal modeling and cross-modal reconstruction of biosignals have been explored before, here we demonstrate that we can distill representational knowledge across biosignals with different levels of fidelity, i.e., from PPG to accelerometer, using 20 million minutes of unlabeled data collected from ~172K participants in the Apple Heart and Movement Study under informed consent. Our knowledge distillation framework does not require labels; we pre-train PPG encoders via self-supervised learning, and then distill the representational knowledge from the PPG encoders to accelerometer encoders. We first demonstrate strong cross-modal alignment on unseen data, e.g., 99.2% top-1 accuracy for retrieving PPG embeddings from accelerometer embeddings. We show that distilled accelerometer encoders have significantly more informative representations than self-supervised or supervised encoders trained on accelerometer data, with at least 23%-49% improved performance for predicting heart rate and heart rate variability, and that they are readily predictive of a wide array of downstream targets, including demographic variables, health conditions, use of medications, and lifestyle habits. We also demonstrate that our framework can be applied to different encoder architectures and different pre-training strategies for the strong encoder, and can perform cross-modal distillation and model compression simultaneously. Additionally, we perform various ablations on augmentations, hyperparameters, and multi-modal training. We believe our proposed representational knowledge distillation framework may unlock new opportunities for developing digital biomarkers from any wearable device with lower-fidelity biosignals, and help individuals track their health more frequently and conveniently.
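To make the label-free distillation recipe concrete, below is a minimal sketch of one way such cross-modal representational knowledge distillation could be set up: a self-supervised PPG encoder acts as a frozen teacher, and a trainable accelerometer encoder (the student) is aligned to it on time-matched segments. The InfoNCE-style alignment objective, the toy 1D-CNN encoders, and all names below are illustrative assumptions, not the authors' released code; the abstract does not specify the exact loss or architectures.

```python
# Minimal sketch (assumed, not the paper's implementation) of cross-modal
# representational knowledge distillation from a frozen PPG teacher encoder
# to a trainable accelerometer student encoder, using unlabeled paired segments.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder1D(nn.Module):
    """Toy 1D-CNN encoder mapping a biosignal segment to a unit-norm embedding."""
    def __init__(self, in_channels: int, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.proj = nn.Linear(64, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.net(x).squeeze(-1)               # (B, 64)
        return F.normalize(self.proj(h), dim=-1)  # (B, embed_dim)

def alignment_loss(acc_emb, ppg_emb, temperature: float = 0.07):
    """Symmetric InfoNCE loss aligning accelerometer and PPG embeddings,
    which also supports retrieving PPG embeddings from accelerometer embeddings."""
    logits = acc_emb @ ppg_emb.t() / temperature  # (B, B) cosine-similarity matrix
    targets = torch.arange(acc_emb.size(0), device=acc_emb.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Teacher: PPG encoder pre-trained with self-supervision, then frozen.
ppg_encoder = Encoder1D(in_channels=1)
for p in ppg_encoder.parameters():
    p.requires_grad_(False)

# Student: accelerometer encoder trained only with the alignment loss (no labels).
acc_encoder = Encoder1D(in_channels=3)
optimizer = torch.optim.AdamW(acc_encoder.parameters(), lr=1e-3)

# One unlabeled training step on time-aligned (PPG, accelerometer) segments.
ppg_batch = torch.randn(16, 1, 1024)  # dummy PPG segments
acc_batch = torch.randn(16, 3, 1024)  # dummy tri-axial accelerometer segments
with torch.no_grad():
    ppg_emb = ppg_encoder(ppg_batch)
loss = alignment_loss(acc_encoder(acc_batch), ppg_emb)
loss.backward()
optimizer.step()
```

Because the teacher is frozen and only its embeddings are consumed, the student encoder can use a smaller architecture than the teacher, which is one plausible way the same setup yields model compression alongside cross-modal distillation.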
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5097