Abstract: Discrete Graph Diffusion Models (DGDMs) mark a pivotal advancement in graph generation, effectively preserving sparsity and structural integrity and thereby enhancing the learning of graph data distributions for diverse generative applications. Despite their potential, DGDMs are computationally intensive due to numerous low-parameter yet high-computation operations, increasing the need for inference acceleration. A promising way to mitigate this issue is model quantization. However, existing quantization techniques for Image Diffusion Models (IDMs) face limitations in DGDMs due to differing diffusion processes, while Large Language Model (LLM) quantization focuses on reducing the memory-access latency of loading large parameters; in DGDMs, by contrast, the inference bottleneck is computation because of their smaller model sizes. To fill this gap, we introduce Bit-DGDM, a post-training quantization framework for DGDMs that incorporates two novel ideas: (i) sparse-dense activation quantization, which sparsely models activation outliers in full precision through adaptively selected, data-free thresholds and quantizes the remaining activations to low-bit precision, and (ii) ill-conditioned low-rank decomposition, which decomposes the weights into a low-rank component that enables faster inference and an $\alpha$-sparsity matrix that models the outliers. Extensive experiments demonstrate that Bit-DGDM not only reduces memory usage by up to $2.8\times$ over the FP32 baseline and achieves up to $2.5\times$ speedup, but also achieves comparable performance at ultra-low precision down to 4-bit.
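To make the two ideas in the abstract concrete, the sketch below illustrates (i) sparse-dense activation quantization, where the largest-magnitude activations are kept in full precision and the rest are quantized to low bit-width, and (ii) a weight split into a low-rank component plus an $\alpha$-sparse outlier matrix. This is a minimal PyTorch-style sketch under assumed conventions; the function names (`sparse_dense_quantize`, `low_rank_plus_sparse`) and the magnitude-based threshold heuristic are illustrative assumptions, not the authors' exact Bit-DGDM implementation (see the linked repository for that).

```python
# Minimal sketch of sparse-dense activation quantization and
# low-rank + alpha-sparse weight decomposition. Names and heuristics
# are illustrative assumptions, not the official Bit-DGDM code.
import torch

def sparse_dense_quantize(x: torch.Tensor, bits: int = 4, outlier_frac: float = 0.01):
    """Keep the largest-magnitude activations in full precision (sparse part)
    and quantize the remaining activations to low bit-width (dense part)."""
    k = max(1, int(outlier_frac * x.numel()))
    # Data-free threshold: magnitude of the k-th largest activation.
    thresh = x.abs().flatten().topk(k).values.min()
    outlier_mask = x.abs() >= thresh
    dense = torch.where(outlier_mask, torch.zeros_like(x), x)
    # Symmetric uniform quantization of the inlier (dense) part.
    qmax = 2 ** (bits - 1) - 1
    scale = dense.abs().max().clamp(min=1e-8) / qmax
    dense_q = torch.clamp(torch.round(dense / scale), -qmax, qmax) * scale
    # Full-precision outliers + low-bit inliers.
    return torch.where(outlier_mask, x, dense_q)

def low_rank_plus_sparse(w: torch.Tensor, rank: int = 8, alpha: float = 0.01):
    """Split a weight matrix into a low-rank component (fast matmul path)
    and an alpha-sparse residual that captures the outliers."""
    u, s, vh = torch.linalg.svd(w, full_matrices=False)
    low_rank = (u[:, :rank] * s[:rank]) @ vh[:rank]
    residual = w - low_rank
    k = max(1, int(alpha * residual.numel()))
    thresh = residual.abs().flatten().topk(k).values.min()
    sparse = torch.where(residual.abs() >= thresh, residual, torch.zeros_like(residual))
    return low_rank, sparse  # w is approximated by low_rank + sparse

# Usage: quantize activations and decompose a linear layer's weight.
x = torch.randn(64, 256)
w = torch.randn(512, 256)
xq = sparse_dense_quantize(x, bits=4)
lr, sp = low_rank_plus_sparse(w, rank=16, alpha=0.01)
y = xq @ (lr + sp).t()
```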
Lay Summary: Discrete Graph Diffusion Models (DGDMs) significantly advance graph generation by preserving sparsity and structural integrity and improving the learning of graph data distributions, but their many low-parameter yet high-computation operations make them computationally intensive, increasing the need for inference acceleration. We introduce Bit-DGDM, an outlier-aware post-training quantization framework for DGDMs that enables faster inference and reduces memory usage.
Link To Code: https://github.com/KellyGong/BitDGDM
Primary Area: Deep Learning->Graph Neural Networks
Keywords: Post-Training Quantization, Discrete Graph Diffusion Models
Submission Number: 9670