ACIQ: Analytical Clipping for Integer Quantization of neural networks

Ron Banner; Yury Nahshan; Elad Hoffer; Daniel Soudry

27 Sept 2018 (modified: 22 Jun 2025), ICLR 2019 Conference Blind Submission, Readers: Everyone

Abstract: We analyze the trade-off between quantization noise and clipping distortion in low-precision networks. We identify the statistics of various tensors and derive exact expressions for the mean-square-error degradation due to clipping. By optimizing these expressions, we show marked improvements over standard quantization schemes that normally avoid clipping. For example, just by choosing accurate clipping values, an accuracy improvement of more than 40% is obtained when quantizing VGG-16 to 4 bits of precision. Our results have many applications for the quantization of neural networks at both training and inference time.
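The trade-off described in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's method: it assumes a Laplace-distributed tensor and a symmetric uniform quantizer, and it finds the clipping value by brute-force sweep rather than by the analytical expressions the abstract refers to. All function names (`quantize`, `mean_squared_error`, `sweep_clipping`) are hypothetical.

```python
import random


def quantize(value, alpha, num_bits):
    """Clip value to [-alpha, alpha], then round to a uniform grid of 2**num_bits levels."""
    step = 2.0 * alpha / (2 ** num_bits - 1)
    clipped = max(-alpha, min(alpha, value))
    return round((clipped + alpha) / step) * step - alpha


def mean_squared_error(samples, alpha, num_bits):
    """Total error: clipping distortion plus rounding (quantization) noise."""
    return sum((v - quantize(v, alpha, num_bits)) ** 2 for v in samples) / len(samples)


def sweep_clipping(samples, num_bits, num_candidates=50):
    """Try evenly spaced clipping values up to max |x| and keep the one with the lowest MSE."""
    max_abs = max(abs(v) for v in samples)
    candidates = [max_abs * (i + 1) / num_candidates for i in range(num_candidates)]
    errors = [mean_squared_error(samples, a, num_bits) for a in candidates]
    best = min(range(num_candidates), key=errors.__getitem__)
    # errors[-1] corresponds to alpha = max |x|, i.e. the "no clipping" baseline.
    return candidates[best], errors[best], errors[-1], max_abs


# Assumption: a Laplace(0, 1) tensor, drawn as the difference of two Exp(1) samples.
rng = random.Random(0)
samples = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(20_000)]

best_alpha, best_mse, no_clip_mse, max_abs = sweep_clipping(samples, num_bits=4)
print(f"max |x| = {max_abs:.2f}, best clipping value = {best_alpha:.2f}")
print(f"MSE without clipping = {no_clip_mse:.4f}, MSE at best clipping value = {best_mse:.4f}")
```

A small clipping value distorts the tails, while a large one stretches the 16 available levels over a wide range and increases rounding noise, so under these assumptions the lowest error falls at an interior clipping value.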

Keywords: quantization, reduced precision, training, inference, activation

TL;DR: We analyze the trade-off between quantization noise and clipping distortion in low-precision networks, and show marked improvements over standard quantization schemes that normally avoid clipping.

Code: [submission2019/AnalyticalScaleForIntegerQuantization](https://github.com/submission2019/AnalyticalScaleForIntegerQuantization)

Community Implementations: [3 code implementations](https://www.catalyzex.com/paper/aciq-analytical-clipping-for-integer/code)
