3DPE-Gaze: Unlocking the Potential of 3D Facial Priors for Generalized Gaze Estimation

Published: 18 Sept 2025, Last Modified: 29 Oct 2025, Venue: NeurIPS 2025 poster, License: CC BY 4.0
Keywords: Gaze Estimation, Cross-Domain Adaptation, 3D Facial Priors
Abstract: In recent years, face-based deep-learning gaze estimation methods have advanced significantly. However, while face images provide supplementary information that benefits gaze inference, they also contain substantial extraneous information that increases the risk of overfitting during training and compromises generalization. To alleviate this problem, we propose the 3DPE-Gaze framework, which explicitly models 3D facial priors for feature decoupling and generalized gaze estimation. The framework consists of two core modules: the 3D Geometric Prior Module (3DGP), which incorporates the FLAME model to parameterize facial structure and gaze-irrelevant facial appearance while extracting gaze features; and the Semantic Concept Alignment Module (SCAM), which separates gaze-related and gaze-unrelated concepts through CLIP-guided contrastive learning. Finally, the framework uses 3D facial landmarks as a prior for generalized gaze estimation. Experimental results show that 3DPE-Gaze outperforms existing state-of-the-art methods on four major cross-domain tasks, with particularly strong performance in challenging scenarios such as lighting variations, extreme head poses, and glasses occlusion.
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 3909