Abstract: Vision-based physiological monitoring is an emerging technology that enables more convenient access to cardiovascular health status in many medical industrial applications. This article aims to achieve efficient and accurate identification of pulse waveforms by proposing a weighted temporally consistent 3-D (WTC3D) convolution, in which a spatial weight template is incorporated between the spatial and temporal kernels as a constraint on the temporal kernel. WTC3D employs a temporal kernel to maintain temporal consistency and a spatial weight template to impose spatial diversity during remote photoplethysmography (rPPG) feature learning. A WTC3D-based network with a hybrid loss function is designed for pulse prediction. Experiments on three datasets demonstrate the effectiveness of the proposed approach. By considering the temporal propagation characteristics of the pulse signal in the video, WTC3D convolution not only enables efficient pulse feature learning but also advances the deployment of rPPG networks on resource-limited Internet of Medical Things devices.
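The abstract does not give the exact formulation of WTC3D, so the following is only an illustrative sketch of one possible reading: a single temporal kernel shared across all spatial positions (temporal consistency), scaled per position by a spatial weight template (spatial diversity). The outer-product construction, the function names `wtc3d_kernel` and `conv3d_valid`, and all numeric values are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def wtc3d_kernel(temporal, template):
    """Assumed construction: k[t, h, w] = temporal[t] * template[h, w].

    Every spatial position reuses the same temporal profile, and the
    spatial template only rescales it from position to position.
    """
    temporal = np.asarray(temporal, dtype=float)
    template = np.asarray(template, dtype=float)
    return temporal[:, None, None] * template[None, :, :]


def conv3d_valid(video, kernel):
    """Naive 'valid' 3-D cross-correlation of a (T, H, W) clip with a kernel."""
    windows = np.lib.stride_tricks.sliding_window_view(video, kernel.shape)
    # windows has shape (T', H', W', kt, kh, kw); contract the kernel axes.
    return np.einsum("abcijk,ijk->abc", windows, kernel)


# Hypothetical values, chosen only to exercise the functions.
temporal = np.array([1.0, 2.0, 1.0])
template = np.array([[0.0, 1.0, 0.0],
                     [1.0, 2.0, 1.0],
                     [0.0, 1.0, 0.0]])

kernel = wtc3d_kernel(temporal, template)
clip = np.ones((5, 4, 4))            # stand-in for a single-channel video clip
features = conv3d_valid(clip, kernel)

# Under this reading, the factorized kernel has kt + kh*kw free parameters
# instead of kt*kh*kw for an unconstrained 3-D kernel.
factorized_params = temporal.size + template.size
full_params = kernel.size
```

Under this assumed factorization, a 3×3×3 kernel is described by 12 values rather than 27, which is one way a constraint on the temporal kernel could reduce cost on constrained devices; the paper itself should be consulted for the actual design.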
External IDs: dblp:journals/tii/ZhaoCHHCL25