Unveiling Hidden Details: A RAW Data-Enhanced Paradigm for Real-World Super-Resolution

06 Sept 2025 (modified: 12 Nov 2025) · ICLR 2026 Conference Withdrawn Submission
Keywords: Real-World Super-Resolution, RAW Data, Generalization
TL;DR: This paper introduces RealSR-RAW, a dataset of over 10,000 paired LR and HR RGB images with corresponding LR RAW data, and proposes a novel RAW adapter that leverages RAW data to enhance Real SR model performance.
Abstract: Real-world image super-resolution (Real SR) aims to generate high-fidelity, detail-rich high-resolution (HR) images from low-resolution (LR) counterparts. Existing Real SR methods primarily focus on generating details from the LR RGB domain, which often leads to a lack of richness or fidelity in fine details. In this paper, we pioneer the use of details hidden in RAW data to complement existing RGB-only methods, yielding superior outputs. We argue that key steps in the Image Signal Processing (ISP) pipeline, such as denoising and demosaicing, inherently discard fine details in LR images, making LR RAW a valuable information source. To validate this, we present RealSR-RAW, a comprehensive dataset comprising over 10,000 pairs of LR and HR RGB images, along with the corresponding LR RAW data, captured across multiple smartphones under varying focal lengths and diverse scenes. Additionally, we propose a simple yet efficient and general RAW adapter that effectively integrates LR RAW data into existing CNN-, Transformer-, and Diffusion-based Real SR models by extracting fine-grained details from RAW data to enhance performance. Extensive experiments demonstrate that incorporating RAW data significantly enhances detail recovery and improves Real SR performance across ten evaluation metrics, including both fidelity- and perception-oriented metrics, under real-world and wild-captured scenarios. Our findings open a new direction for the Real SR task, and the dataset and code will be made available to support future research.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 2626