QNCD: Quantization Noise Correction for Diffusion Models

Published: 20 Jul 2024, Last Modified: 21 Jul 2024, MM2024 Poster, CC BY 4.0
Abstract: Diffusion models have revolutionized image synthesis, setting new benchmarks in quality and creativity. However, their widespread adoption is hindered by the intensive computation required during the iterative denoising process. Post-training quantization (PTQ) offers a way to accelerate sampling, albeit at the expense of sample quality, especially in low-bit settings. To address this, our study introduces a unified Quantization Noise Correction Scheme (QNCD), aimed at diminishing quantization noise throughout the sampling process. We identify two primary quantization challenges: intra and inter quantization noise. Intra quantization noise, mainly exacerbated by embeddings in the resblock module, extends activation quantization ranges, increasing disturbances in each single denoising step. Inter quantization noise, in contrast, stems from cumulative quantization deviations across the entire denoising process, altering data distributions step by step. QNCD combats these through embedding-derived feature smoothing for eliminating intra quantization noise and an effective runtime noise estimation module for dynamically filtering inter quantization noise. Extensive experiments demonstrate that our method outperforms previous quantization methods for diffusion models, achieving lossless results in W4A8 and W8A8 quantization settings on ImageNet (LDM-4).
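The claim that embeddings extend activation quantization ranges, and so increase per-step noise, can be illustrated with a minimal sketch. This is not the QNCD method itself: it assumes a plain per-tensor min-max uniform quantizer, Gaussian stand-in features, and hypothetical per-channel embedding offsets (`fake_quantize`, `channel_offsets`, and all values below are illustrative assumptions, not taken from the paper).

```python
import random


def fake_quantize(values, num_bits=8):
    """Per-tensor min-max uniform quantization followed by dequantization."""
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    return [round((v - lo) / scale) * scale + lo for v in values]


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)

# Hypothetical per-channel embedding offsets (assumed values, for illustration only).
channel_offsets = [-15.0, -5.0, 5.0, 15.0]
samples_per_channel = 2000

# Stand-in features: unit Gaussian activations, flattened channel by channel.
features = [
    random.gauss(0.0, 1.0)
    for _ in channel_offsets
    for _ in range(samples_per_channel)
]

# Adding a different offset to each channel widens the range of the whole tensor.
features_plus_embedding = [
    value + channel_offsets[i // samples_per_channel]
    for i, value in enumerate(features)
]

# Quantization error without and with the range-extending offsets.
mse_features_only = mean_squared_error(features, fake_quantize(features))
mse_with_embedding = mean_squared_error(
    features_plus_embedding, fake_quantize(features_plus_embedding)
)

print(f"MSE, features only:        {mse_features_only:.6f}")
print(f"MSE, features + embedding: {mse_with_embedding:.6f}")
```

Under these assumptions the second error is larger because the same 256 levels must cover a wider range, so each quantization step is coarser. A smoothing step that narrows the range before quantization would act on exactly this effect.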
Primary Subject Area: [Generation] Generative Multimedia
Relevance To Conference: The primary goal of this work is to accelerate diffusion models, which are pivotal in the multimedia field and a mainstream method for generation tasks. However, the large model size and computational demands of diffusion models restrict their deployment on edge devices. This work uses model quantization to address this issue, reducing computational demands while avoiding excessive degradation of synthetic image quality. The approach extends the application scenarios of diffusion models to more edge devices and improves generation efficiency, providing a better user experience.
Supplementary Material: zip
Submission Number: 4020