Keywords: Synthetic Data, Portrait Lighting Estimation, Lighting Estimation, Synthetic Portrait Images, Fairness, Demographic Bias
Abstract: We propose a synthetic-data-based training framework for real-time deep learning models that predict an omnidirectional high-dynamic-range (HDR) environment light map from a single limited-field-of-view, low-dynamic-range portrait image. Training lighting estimation models requires paired portrait images and corresponding environment maps. Prior work generates such data by capturing relightable real-face datasets in specialized light stages and then relighting the faces with HDR environment maps. This process is costly and time-consuming; consequently, the resulting datasets often cover a limited number of subjects and are prone to demographic bias. On the other hand, recent graphics-based synthetic portrait images, which combine a parametric 3D face model with a comprehensive collection of hand-crafted assets such as skin, hair, and clothing, have reached a high level of photorealism. Leveraging the ease of collecting diverse synthetic data, we explore its potential in the domain of portrait lighting estimation. Our training framework pre-trains on labeled synthetic data and fine-tunes on unlabeled real portrait videos. Our model achieves state-of-the-art performance in zero-shot evaluation on a real portrait image benchmark dataset. Furthermore, we conduct a fairness analysis, showing that our model is more robust to demographic differences than existing state-of-the-art models.
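The abstract describes a two-stage training framework: supervised pre-training on synthetic portrait/environment-map pairs, then fine-tuning on unlabeled real portrait videos. The following is a minimal sketch of that setup, not the paper's implementation: the `LightEstimator` architecture, the log-space L1 pre-training loss, and the temporal-consistency fine-tuning signal are all assumptions introduced here for illustration, since the abstract does not specify them.

```python
# Hypothetical two-stage training sketch; architecture and losses are
# assumptions, not the paper's actual method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LightEstimator(nn.Module):
    """Toy CNN mapping a 256x256 LDR portrait to a 16x32 HDR environment map."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=4), nn.ReLU(),   # 256 -> 64
            nn.Conv2d(32, 64, 4, stride=4), nn.ReLU(),  # 64 -> 16
            nn.Conv2d(64, 128, 4, stride=4), nn.ReLU(), # 16 -> 4
            nn.Flatten(),
        )
        self.head = nn.Linear(128 * 4 * 4, 3 * 16 * 32)

    def forward(self, x):
        # Softplus keeps predicted HDR radiance non-negative.
        out = F.softplus(self.head(self.backbone(x)))
        return out.view(-1, 3, 16, 32)


model = LightEstimator()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)


# Stage 1: supervised pre-training on synthetic (portrait, env map) pairs.
def pretrain_step(portraits, env_maps):
    # Log-space L1 is a common choice for HDR targets (an assumption here).
    loss = (torch.log1p(model(portraits)) - torch.log1p(env_maps)).abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


# Stage 2: fine-tuning on unlabeled real videos. Without labels, one plausible
# self-supervised signal (assumed, not stated in the abstract) is temporal
# consistency: lighting should change slowly between neighboring frames.
def finetune_step(frame_t, frame_t_plus_1):
    loss = (model(frame_t) - model(frame_t_plus_1)).abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Under these assumptions, `pretrain_step` would run over the synthetic labeled set first, after which `finetune_step` adapts the pre-trained weights to the real-video domain without requiring ground-truth environment maps.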
Supplementary Material: pdf
Submission Number: 14