Keywords: Blind Face Restoration, One-Step Diffusion, Knowledge Distillation
Paper Track: Long Paper (archival)
Abstract: Blind face restoration inherently struggles with the perception-distortion trade-off. Regression-based methods minimize distortion but inevitably produce over-smoothed, blurry textures, while generative models synthesize sharp details but suffer from prohibitive computational latency or structural identity shifts. Recent flow-matching paradigms elegantly reconcile this conflict by explicitly transporting the distortion-optimal posterior mean to the natural image manifold; however, they still rely on computationally expensive ODE solvers. In this work, we propose DirectFlow to achieve this balance in a single forward pass. Rather than naively matching ground-truth images, DirectFlow distills the transported posterior mean of a Rectified Flow teacher directly into a latent consistency model. We repurpose the teacher's velocity field as a distributional critic, pulling single-step predictions towards true data manifold peaks without solver integration error. Furthermore, to adapt dynamically to diverse degradations, we introduce a semantic adapter paired with Low-Rank Adaptation (LoRA) in the UNet's cross-attention layers. Through a two-stage optimization schedule that first aligns the conditioning modules and then freezes the spatial pathways during LoRA refinement, this strategy enables continuous, degradation-aware conditioning without compromising the foundational generative prior. Extensive experiments demonstrate that one-step DirectFlow matches the perceptual quality of a 25-step flow-matching teacher while accelerating inference by 7.5x, establishing a highly efficient state-of-the-art architecture for real-time, high-fidelity blind face restoration.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 19