Keywords: Face Restoration, Generative Models, Latent Space Models
Abstract: The Perception-Distortion tradeoff (PD-tradeoff) theory suggests that face restoration algorithms must balance perceptual quality and fidelity. To achieve minimal distortion while maintaining perfect perceptual quality, Posterior-Mean Rectified Flow (PMRF) proposes a flow based approach where source distribution is minimum distortion estimations. Although PMRF is shown to be effective, its pixel-space modeling approach limits its ability to align with human perception, where human perception is defined as how humans distinguish between two image distributions. In this work, we propose Latent-PMRF, which reformulates PMRF in the latent space of a variational autoencoder (VAE), facilitating better alignment with human perception during optimization. By defining the source distribution on latent representations of minimum distortion estimation, we bound the minimum distortion by the VAE's reconstruction error. Moreover, we reveal the design of VAE is crucial, and our proposed VAE significantly outperforms existing VAEs in both reconstruction and restoration. Extensive experiments on blind face restoration demonstrate the superiority of Latent-PMRF, offering an improved PD-tradeoff compared to existing methods, along with remarkable convergence efficiency, achieving a 5.79X speedup over PMRF in terms of FID. Our code will be available as open-source.
Primary Area: generative models
Submission Number: 6524
Loading