TL;DR: This paper proposes GTASR, a one-step real-world super-resolution framework that mitigates consistency drift and geometric decoupling via a Trajectory Alignment strategy and a Dual-Reference Structural Rectification mechanism.
Abstract: Diffusion-based Real-World Image Super-Resolution (Real-ISR) achieves impressive perceptual quality but suffers from high computational costs due to iterative sampling. While recent distillation approaches leveraging large-scale Text-to-Image (T2I) priors have enabled one-step generation, they are typically hindered by prohibitive parameter counts and the inherent capability bounds imposed by teacher models. As a lightweight alternative, Consistency Models offer efficient inference but struggle with two critical limitations: the accumulation of consistency drift inherent to transitive training, and a phenomenon we term "Geometric Decoupling", where the generative trajectory achieves pixel-wise alignment yet fails to preserve structural coherence. To address these challenges, we propose GTASR (Geometric Trajectory Alignment Super-Resolution), a simple yet effective consistency training paradigm for Real-ISR. Specifically, we introduce a Trajectory Alignment (TA) strategy to rectify the tangent vector field via full-path projection, and a Dual-Reference Structural Rectification (DRSR) mechanism to enforce strict structural constraints. Extensive experiments verify that GTASR delivers superior performance over representative baselines while maintaining minimal latency.
Lay Summary: Many images in real-world scenarios are blurry, compressed, or low in resolution, such as old photos and online images. Image super-resolution aims to turn these low-quality inputs into more natural and visually detailed images. Recent fast methods often rely on large text-to-image models as teachers, but these models are heavy and can limit the flexibility of the restoration system. A lighter alternative is to use consistency models, which are designed to restore an image in a single step, but we find that they can accumulate errors during training and may fail to preserve accurate visual structures such as edges, textures, and object shapes. We address these problems with GTASR, a fast one-step image restoration method that improves how consistency models learn the restoration path. Our method helps the model follow a more reliable restoration direction and adds structural guidance so that the restored image keeps more stable and realistic details. This makes one-step image restoration more practical, allowing GTASR to improve visual quality and structural preservation while keeping the inference cost very low.
Primary Area: Applications->Computer Vision
Keywords: Image Super-Resolution, One-Step Diffusion, Consistency Model
Submission Number: 8134