Diffusion Posterior Proximal Sampling for Image Restoration

Published: 20 Jul 2024, Last Modified: 06 Aug 2024, MM2024 Oral, CC BY 4.0
Abstract: Diffusion models have demonstrated remarkable efficacy in generating high-quality samples. Existing diffusion-based image restoration algorithms exploit pre-trained diffusion models to leverage data priors, yet they still preserve elements inherited from the unconditional generation paradigm. These strategies initiate the denoising process with pure white noise and incorporate random noise at each generative step, leading to over-smoothed results. In this paper, we present a refined paradigm for diffusion-based image restoration. Specifically, we opt for a sample consistent with the measurement identity at each generative step, exploiting the sampling selection as an avenue for output stability and enhancement. The number of candidate samples used for selection is adaptively determined based on the signal-to-noise ratio of the timestep. Additionally, we start the restoration process with an initialization combined with the measurement signal, providing supplementary information to better align the generative process. Extensive experimental results and analyses validate that our proposed method significantly enhances image restoration performance while consuming negligible additional computational resources.
Primary Subject Area: [Generation] Generative Multimedia
Relevance To Conference: This work contributes to multimedia/multimodal processing by proposing a refined approach for diffusion-based image restoration, which is a crucial task in multimedia applications. By exploiting sample consistency and incorporating measurement signals during the denoising process, the proposed method can generate sharper and more accurate restored images compared to existing diffusion-based techniques. This advancement in image restoration can significantly enhance various multimedia applications that involve image processing, such as image editing, image enhancement, and image-based rendering. Additionally, the improved restoration quality can potentially benefit multimodal tasks that combine image data with other modalities, such as text-to-image synthesis or video restoration.
Supplementary Material: zip
Submission Number: 4657
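
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the SNR-to-candidate-count rule (including its direction), and the use of an L2 residual against the measurement are all assumptions made for illustration. Here `mean` and `sigma` stand for the mean and standard deviation of one reverse diffusion step, `forward_op` for the degradation operator, and `y` for the measurement.

```python
import numpy as np


def num_candidates(snr, n_max=8):
    """Hypothetical schedule mapping a timestep's SNR to a candidate count.

    Assumption: the count grows with SNR and stays in [1, n_max].
    The abstract only says the count depends on SNR, not how.
    """
    frac = snr / (1.0 + snr)  # equals alpha_bar_t in DDPM notation
    return 1 + int(round((n_max - 1) * frac))


def select_candidate(mean, sigma, y, forward_op, n, rng):
    """Draw n candidate samples for one reverse step and keep the one
    whose degraded version is closest to the measurement y.

    Assumption: consistency is measured by the L2 norm of
    forward_op(candidate) - y.
    """
    eps = rng.standard_normal((n,) + mean.shape)  # n independent noise draws
    candidates = mean + sigma * eps               # n samples from N(mean, sigma^2 I)
    residuals = [np.linalg.norm(forward_op(c) - y) for c in candidates]
    return candidates[int(np.argmin(residuals))]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mean = np.zeros(4)        # toy posterior mean for one step
    y = np.ones(4)            # toy measurement
    identity = lambda v: v    # toy degradation operator
    n = num_candidates(snr=3.0)
    x_next = select_candidate(mean, 1.0, y, identity, n, rng)
    print(n, np.linalg.norm(x_next - y))
```

Drawing several noise candidates and keeping one adds only a few cheap evaluations of the degradation operator per step, with no extra network calls in this sketch, which is in line with the abstract's claim of negligible additional cost.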