Self-rPPG: Learning the Optical & Physiological Mechanics of Remote Photoplethysmography with Self-Supervision
Abstract: Remote photoplethysmography (rPPG) systems provide a contactless, low-cost, and ubiquitous mechanism for regular heart rate (HR) monitoring by leveraging the diffuse reflection caused by blood-volume variations in human skin tissue (i.e., the PPG signal). However, rPPG has seen limited adoption because no generalized methodology exists for estimating HR from skin videos under diverse practical scenarios. Traditional supervised approaches require large amounts of ground-truth annotation synchronized between video and rPPG signals, which has severely limited the development of end-to-end, generalizable rPPG models. In this paper, we propose Self-rPPG, which learns the optical and physiological mechanics of rPPG directly from unlabeled videos, without any synchronized rPPG signal annotation. We design a self-supervised, contrastive-learning-based pretraining strategy that learns representations of the underlying diffusion signals' frequency and phase, as well as the temporal coherence of video frames, from unlabeled frame sequences collected across multiple public datasets. We run extensive experiments on the choice of contrastive learning scheme (loss functions and sampling strategies) and on the saliency of the features learned by Self-rPPG, showing that our self-supervised representations successfully encode the diffusion signals' frequency and phase while remaining robust to temporal corruption. We validate Self-rPPG on three public datasets, where it outperforms supervised state-of-the-art methods in PPG reconstruction and HR estimation using only 10% of the labeled data.
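To make the pretraining objective concrete, the sketch below shows a generic InfoNCE-style contrastive loss over clip embeddings. This is a minimal illustration of the kind of objective the abstract describes, not the paper's actual implementation: the function name `info_nce_loss`, the tensor shapes, the temperature value, and the assumption that positives come from frequency/phase-consistent views while negatives come from temporally corrupted or frequency-shifted clips are all ours for illustration.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor: torch.Tensor,
                  positive: torch.Tensor,
                  negatives: torch.Tensor,
                  temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style contrastive loss over video-clip embeddings.

    anchor:    (B, D)    embeddings of anchor clips
    positive:  (B, D)    embeddings of frequency/phase-consistent views (assumed sampling)
    negatives: (B, K, D) embeddings of temporally shuffled / frequency-shifted clips (assumed sampling)
    """
    # L2-normalize so dot products are cosine similarities.
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    # Similarity to the positive view: (B, 1).
    pos_sim = torch.sum(anchor * positive, dim=-1, keepdim=True)
    # Similarity to each of the K negative views: (B, K).
    neg_sim = torch.einsum("bd,bkd->bk", anchor, negatives)

    # Positive logit sits at index 0; cross-entropy pulls the anchor
    # toward its positive and pushes it away from the negatives.
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```

Under this framing, an encoder that minimizes the loss must map clips sharing the same underlying pulse frequency and phase close together while separating temporally corrupted clips, which is one plausible way to realize the frequency, phase, and temporal-coherence objectives the abstract lists.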