HeadsUp! High-Fidelity Portrait Image Super-Resolution

17 Sept 2025 (modified: 14 Nov 2025)ICLR 2026 Conference Withdrawn SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Portrait Image Super-Resolution, Diffusion Models
Abstract: Portrait pictures, which typically feature both human subjects and natural backgrounds, are one of the most prevalent forms of photography on social media. Existing image super-resolution (ISR) techniques generally focus either on generic real-world images or strictly aligned facial images (i.e., face super-resolution). In practice, separate models are blended to handle portrait photos: the face specialist model handles the face region, and the general model processes the rest. However, these blending approaches inevitably introduce blending or boundary artifacts around the facial regions due to different model training recipes, while human perception is particularly sensitive to facial fidelity. To overcome these limitations, we study the portrait image supersolution (PortraitISR) problem, and propose HeadsUp, a single-step diffusion model that is capable of seamlessly restoring and upscaling portrait images in an end-to-end manner. Specifically, we build our model on top of a single-step diffusion model and develop a face supervision mechanism to guide the model in focusing on the facial region. We then integrate a reference-based mechanism to help with identity restoration, reducing face ambiguity in low-quality face restoration. Additionally, we have built a high-quality 4K portrait image ISR dataset dubbed PortraitSR-4K, to support model training and benchmarking for portrait images. Extensive experiments show that HeadsUp achieves state-of-the-art performance on the PortraitISR task while maintaining comparable or higher performance on both general image and aligned face datasets.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 8373
Loading