Keywords: Diffusion Models, Residual Modeling, Few-step Inference, Super-Resolution, Denoising
TL;DR: RDIMs: a few-step residual-based diffusion framework enabling faithful image reconstruction.
Abstract: Diffusion models have recently achieved state-of-the-art results in image generation and reconstruction, yet their initialization from pure Gaussian noise makes them poorly aligned with inverse problems such as denoising and super-resolution. This mismatch leads to inefficiency, often requiring hundreds of sampling steps, and induces hallucinations that drift the reconstruction away from the ground truth. To overcome these challenges, residual diffusion implicit models (RDIMs) are proposed, constituting a generalized framework that explicitly models the residuals between high-quality (HQ) and low-quality (LQ) images. RDIMs align the forward process with the actual degradation, enabling reconstructions that are faster and more accurate. Inspired by implicit sampling, the reverse process can skip intermediate timesteps, allowing for few-step or even single-step reconstructions while mitigating the hallucinations inherent to long diffusion chains. Furthermore, RDIMs introduce a controllable variance mechanism that interpolates between deterministic and stochastic sampling, balancing fidelity and diversity depending on degradation severity. Experiments on denoising and super-resolution benchmarks demonstrate that RDIMs consistently outperform conventional DDPMs and match or surpass ResShift, while reducing the number of sampling steps by up to $100\times$. The results position RDIMs as an efficient solution for a broad range of image restoration tasks.
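The residual forward process described above can be sketched as follows. This is a minimal illustration assuming a ResShift-style marginal, $x_t = x_0 + \eta_t (y - x_0) + \kappa \sqrt{\eta_t}\,\epsilon$, which the abstract says RDIMs generalize; the schedule value `eta_t` and noise scale `kappa` here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def residual_forward(x0, y, eta_t, kappa=1.0, rng=None):
    """Sample x_t from a residual forward process.

    Instead of destroying x0 with pure Gaussian noise (as in DDPM),
    the forward process shifts the HQ image x0 toward the LQ image y,
    so the chain's endpoint matches the actual degradation.

    x_t = x0 + eta_t * (y - x0) + kappa * sqrt(eta_t) * eps
    with eta_t in [0, 1] (eta_0 = 0, eta_T = 1) and eps ~ N(0, I).
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal(x0.shape)
    return x0 + eta_t * (y - x0) + kappa * np.sqrt(eta_t) * eps
```

At `eta_t = 0` the sample is the HQ image; at `eta_t = 1` it is the LQ image plus Gaussian noise of scale `kappa`. Setting `kappa = 0` recovers a fully deterministic interpolation, matching the abstract's controllable-variance idea of trading off fidelity and diversity.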
Primary Area: generative models
Submission Number: 19115