Towards Characterizing Knowledge Distillation of PPG Heart Rate Estimation Models

Published: 23 Sept 2025, Last Modified: 18 Oct 2025
Venue: TS4H Workshop, NeurIPS 2025
License: CC BY 4.0
Keywords: PPG, heart rate, knowledge distillation, scaling law
TL;DR: We explore and characterize how a large PPG heart rate estimation model might be distilled into a smaller model appropriate for running on-device, in real-time, on wearable devices.
Abstract: Heart rate estimation from photoplethysmography (PPG) signals generated by wearable devices such as smartwatches and fitness trackers has significant implications for the health and well-being of individuals. Neural models designed to estimate heart rate are largely deployed on wearable devices and thus must adhere to strict memory and latency constraints. In this workshop submission, we explore and characterize how large pre-trained PPG models can be distilled into smaller models appropriate for real-time inference on the edge. We evaluate four distillation strategies: hard distillation, soft distillation, decoupled knowledge distillation (DKD), and feature distillation, through comprehensive sweeps of teacher and student model capacities. We present a characterization of the resulting scaling laws describing the relationship between model size and performance. This early investigation lays the groundwork for practical and predictable methods for building edge-deployable models for physiological sensing.
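To make the distillation setup concrete, the following is a minimal sketch of a soft-distillation objective for a regression target such as heart rate. This is an illustrative assumption, not the paper's actual loss: the function name `distillation_loss` and the weighting parameter `alpha` are hypothetical, and the paper's hard, DKD, and feature variants would use different terms.

```python
import numpy as np

def distillation_loss(student_pred, teacher_pred, target, alpha=0.5):
    """Hedged sketch: a convex combination of the supervised error against
    ground-truth heart rate labels and the error against the teacher's
    predictions. `alpha` (assumed name) balances the two terms."""
    hard = np.mean((student_pred - target) ** 2)        # supervised term
    soft = np.mean((student_pred - teacher_pred) ** 2)  # teacher-matching term
    return alpha * hard + (1.0 - alpha) * soft

# Example with heart rates in beats per minute:
student = np.array([72.0, 80.0])
teacher = np.array([70.0, 81.0])
target  = np.array([71.0, 79.0])
loss = distillation_loss(student, teacher, target, alpha=0.5)  # -> 1.75
```

Setting `alpha=1.0` recovers ordinary supervised training, while lower values push the student toward the teacher's (possibly smoother) predictions.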
Submission Number: 19