Keywords: Deep Neural Networks, Weight Quantization, Expressivity, Universal Approximation, Expressive Degradation
TL;DR: This paper conducts a theoretical investigation into the expressive capability of deep neural networks relative to the number of quantization bits.
Abstract: In recent years, weight quantization, which encodes the connection weights of neural networks in an $n$-bit format, has garnered significant attention due to its potential for model compression. Many implementation techniques have been developed; however, the theory remains incomplete in many respects, especially regarding how approximation ability and expressive power degrade as the number of quantization bits decreases. In this paper, we conduct a theoretical investigation into the expressive capability of deep neural networks relative to the number of quantization bits. We establish the universal approximation property of quantized neural networks with linear width and exponential depth. Additionally, we show that weight quantization leads to expressive degradation, in which the expressive capacity of quantized neural networks degrades polynomially as the number of quantization bits decreases. These theoretical findings provide a solid foundation for advancing weight quantization in the context of scaling laws and offer insights for future research in model compression and acceleration.
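For readers unfamiliar with the setting, the following is a minimal sketch of what "$n$-bit weight quantization" typically means in practice: a symmetric uniform quantizer that restricts each weight to one of $2^n$ evenly spaced levels. This is an illustrative assumption only (the function `quantize_weights` and its rounding scheme are not taken from the paper, which studies the resulting expressivity theoretically rather than prescribing a specific quantizer).

```python
import numpy as np

def quantize_weights(w: np.ndarray, n_bits: int) -> np.ndarray:
    """Illustrative symmetric uniform n-bit quantizer (not the paper's construction).

    Each weight is rounded to one of 2**n_bits evenly spaced levels
    spanning roughly [-max|w|, +max|w|]. Assumes w is not all zeros.
    """
    q_max = 2 ** (n_bits - 1) - 1        # largest signed integer level
    scale = np.max(np.abs(w)) / q_max    # step size of the quantization grid
    levels = np.clip(np.round(w / scale), -q_max - 1, q_max)
    return levels * scale

# Example: quantizing a random weight matrix to 4 bits
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 3))
W_q = quantize_weights(W, n_bits=4)
print(np.unique(W_q).size, "distinct weight values after 4-bit quantization")
```

Shrinking `n_bits` shrinks the set of representable weights, which is the source of the expressive degradation the abstract refers to.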
Supplementary Material: pdf
Primary Area: learning theory
Submission Number: 8793