Temporal Gated Face Alignment Network for Camera-Based Physiological Sensing

Published: 2025, Last Modified: 27 Jan 2026. IEEE Trans. Comput. Soc. Syst. 2025. License: CC BY-SA 4.0
Abstract: The remote photoplethysmography (rPPG) technique estimates vital signs, such as heart rate (HR), by analyzing subtle skin color variation in facial videos induced by the pulse. However, it remains a critical challenge to robustly acquire cardiac pulse information in scenarios with head motion (e.g., rotation and swing), as it inevitably introduces interfering noise, such as facial geometric deformation and displacement. Most existing methods primarily focus on how to extract subtle pulse signals while neglecting the detrimental effects of noise, especially out-of-distribution motion patterns. In response to this, we propose a temporal gated face alignment network (TGFAN) to adaptively counteract motion noise in long-term video sequences. Specifically, the bidirectional temporal face alignment (BFA) block first captures interframe motion discrepancies to align the displaced face feature and then extracts motion-robust pulse features. Furthermore, we propose a learnable temporal gating mechanism that disentangles the features and dynamically guides motion-disturbed segments for feature alignment, thereby alleviating local head motion interference. Experimental evaluations on four benchmark datasets demonstrate our superior performance on both intradataset and cross-dataset tests.
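To make the gating idea concrete, the following is a minimal sketch, not the TGFAN implementation. It assumes a per-frame scalar motion magnitude, a sigmoid gate, and a convex blend between unaligned and aligned features. The names `temporal_gate` and `gated_blend`, and the fixed `weight` and `bias` values, are hypothetical; in a trained network these parameters would be learned, and the abstract does not specify the actual gate design.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def temporal_gate(motion, weight=4.0, bias=-2.0):
    """Map per-frame motion magnitudes to gate values in (0, 1).

    weight and bias are illustrative constants (assumption); a learnable
    gate would fit them during training.
    """
    return [sigmoid(weight * m + bias) for m in motion]


def gated_blend(raw, aligned, gates):
    """Per-frame convex combination: g * aligned + (1 - g) * raw.

    Frames with large motion (g near 1) rely on the aligned feature;
    near-static frames (g near 0) keep the raw feature.
    """
    return [
        [g * a + (1.0 - g) * r for r, a in zip(raw_t, aligned_t)]
        for raw_t, aligned_t, g in zip(raw, aligned, gates)
    ]


# Toy sequence of 4 frames; only frame 2 has strong head motion.
motion = [0.0, 0.1, 2.0, 0.05]
raw = [[1.0, 1.0] for _ in motion]      # placeholder unaligned features
aligned = [[0.0, 0.0] for _ in motion]  # placeholder aligned features

gates = temporal_gate(motion)
fused = gated_blend(raw, aligned, gates)
```

In this toy setting, the gate opens only for the motion-disturbed frame, so alignment is applied selectively rather than uniformly across the sequence, which is the behavior the abstract attributes to the temporal gating mechanism.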