ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration

Chi-Wei Hsiao; Yu-Lun Liu; Cheng-Kun Yang; Sheng-Po Kuo; Kevin Jou; Chia-Ping Chen

Published: 25 Sept 2024, Last Modified: 06 Nov 2024, Venue: NeurIPS 2024 poster, License: CC BY-NC-SA 4.0

Keywords: Blind face restoration, Diffusion models, Image restoration, Reference-based, Latent diffusion model

Abstract: While recent works on blind face image restoration have successfully produced impressive high-quality (HQ) images with abundant details from low-quality (LQ) input images, the generated content may not accurately reflect the real appearance of a person. To address this problem, incorporating well-shot personal images as additional reference inputs may be a promising strategy. Inspired by the recent success of the Latent Diffusion Model (LDM) in image generation, we propose ReF-LDM—an adaptation of LDM designed to generate HQ face images conditioned on one LQ image and multiple HQ reference images. Our LDM-based model incorporates an effective and efficient mechanism, CacheKV, for conditioning on reference images. Additionally, we design a timestep-scaled identity loss, enabling LDM to focus on learning the discriminating features of human faces. Lastly, we construct FFHQ-ref, a dataset consisting of 20,406 high-quality (HQ) face images with corresponding reference images, which can serve as both training and evaluation data for reference-based face restoration models.

Primary Area: Diffusion based models

Flagged For Ethics Review: true

Submission Number: 4914
