Keywords: Deep Neural Networks, Weight Quantization, Expressivity, Universal Approximation, Expressive Degradation
TL;DR: This paper conducts a theoretical investigation into the expressive capability of deep neural networks relative to the number of quantization bits.
Abstract: In recent years, weight quantization, which encodes the connection weights of neural networks in an $n$-bit format, has garnered significant attention due to its potential for model compression. Many implementation techniques have been developed; however, the theory remains incomplete in many respects, especially regarding how approximation ability and expressive power degrade as the number of quantization bits decreases. In this paper, we conduct a theoretical investigation into the expressive capability of deep neural networks relative to the number of quantization bits. We establish the universal approximation property of quantized neural networks with linear width and exponential depth. Additionally, we show that weight quantization leads to expressive degradation, in which the expressive capacity of quantized neural networks degrades polynomially as the number of quantization bits decreases. These theoretical findings provide a solid foundation for advancing weight quantization in the context of scaling laws and offer insights for future research in model compression and acceleration.
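For readers unfamiliar with the setting, the following is a minimal sketch of what "$n$-bit weight quantization" typically means in practice: a symmetric uniform quantizer that restricts each weight to one of $2^n$ evenly spaced levels. This is an illustrative assumption only (the function `quantize_weights` and its rounding scheme are not taken from the paper, which studies the resulting expressivity theoretically rather than prescribing a specific quantizer).

```python
import numpy as np

def quantize_weights(w: np.ndarray, n_bits: int) -> np.ndarray:
    """Illustrative symmetric uniform n-bit quantizer (not the paper's construction).

    Each weight is rounded to one of 2**n_bits evenly spaced levels
    spanning roughly [-max|w|, +max|w|]. Assumes w is not all zeros.
    """
    q_max = 2 ** (n_bits - 1) - 1        # largest signed integer level
    scale = np.max(np.abs(w)) / q_max    # step size of the quantization grid
    levels = np.clip(np.round(w / scale), -q_max - 1, q_max)
    return levels * scale

# Example: quantizing a random weight matrix to 4 bits
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 3))
W_q = quantize_weights(W, n_bits=4)
print(np.unique(W_q).size, "distinct weight values after 4-bit quantization")
```

Shrinking `n_bits` shrinks the set of representable weights, which is the source of the expressive degradation the abstract refers to.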
Supplementary Material: pdf
Primary Area: learning theory
Submission Number: 8793