Diffusion-based Extreme Image Compression with Compressed Feature Initialization

Zhiyuan Li; Yanhui Zhou; Hao Wei; Chenyang Ge; Ajmal Saeed Mian

Diffusion-based Extreme Image Compression with Compressed Feature Initialization

Zhiyuan Li, Yanhui Zhou, Hao Wei, Chenyang Ge, Ajmal Saeed Mian

23 Sept 2024 (modified: 25 Jan 2025)ICLR 2025 Conference Withdrawn SubmissionEveryoneRevisionsBibTeXCC BY 4.0

Keywords: extreme image compression, diffusion models, compressed feature initialization, residual diffusion

TL;DR: This paper presents an efficient diffusion-based extreme image compression model that significantly reduces the number of denoising steps required for reconstruction.

Abstract: Diffusion-based extreme image compression methods have achieved impressive performance at extremely low bitrates. However, constrained by the iterative denoising process that starts from pure noise, these methods are limited in both fidelity and efficiency. To address these two issues, we present $\textbf{R}$elay $\textbf{R}$esidual $\textbf{D}$iffusion $\textbf{E}$xtreme $\textbf{I}$mage $\textbf{C}$ompression ($\textbf{RDEIC}$), which leverages compressed feature initialization and residual diffusion. Specifically, we first use the compressed latent features of the image with added noise, instead of pure noise, as the starting point to eliminate the unnecessary initial stages of the denoising process. Second, we design a novel relay residual diffusion that reconstructs the raw image by iteratively removing the added noise and the residual between the compressed and target latent features. Notably, our relay residual diffusion network seamlessly integrates pre-trained stable diffusion to leverage its robust generative capability for high-quality reconstruction. Third, we propose a fixed-step fine-tuning strategy to eliminate the discrepancy between the training and inference phases, further improving the reconstruction quality. Extensive experiments demonstrate that the proposed RDEIC achieves state-of-the-art visual quality and outperforms existing diffusion-based extreme image compression methods in both fidelity and efficiency. The source code and pre-trained models will be released.

Supplementary Material: zip

Primary Area: generative models

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 2931

Loading