PACT: Parameterized Clipping Activation for Quantized Neural Networks

Jungwook Choi; Zhuo Wang; Swagath Venkataramani; Pierce I-Jen Chuang; Vijayalakshmi Srinivasan; Kailash Gopalakrishnan

15 Feb 2018 (modified: 22 Jun 2025), ICLR 2018 Conference Blind Submission, Readers: Everyone

Abstract: Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost. To address this cost, a number of quantization schemes have been proposed - but most of these techniques focused on quantizing weights, which are relatively smaller in size compared to activations. This paper proposes a novel quantization scheme for activations during training - that enables neural networks to work well with ultra low precision weights and activations without any significant accuracy degradation. This technique, PArameterized Clipping acTivation (PACT), uses an activation clipping parameter α that is optimized during training to find the right quantization scale. PACT allows quantizing activations to arbitrary bit precisions, while achieving much better accuracy relative to published state-of-the-art quantization schemes. We show, for the first time, that both weights and activations can be quantized to 4-bits of precision while still achieving accuracy comparable to full precision networks across a range of popular models and datasets. We also show that exploiting these reduced-precision computational units in hardware can enable a super-linear improvement in inferencing performance due to a significant reduction in the area of accelerator compute engines coupled with the ability to retain the quantized model and activation data in on-chip memories.

TL;DR: A new way of quantizing the activations of deep neural networks via parameterized clipping, which optimizes the quantization scale via stochastic gradient descent.
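
The abstract describes the idea only in words, so the following is a minimal sketch of what a clipping parameter α that sets the quantization scale could look like, not the authors' implementation. It assumes activations are clipped to the range [0, α] and then rounded onto a uniform grid of 2^k levels, and that the gradient with respect to α is passed straight through the rounding step. The function names `pact_forward` and `pact_alpha_grad` are hypothetical, chosen for illustration.

```python
def pact_forward(x, alpha, bits):
    """Clip a scalar activation to [0, alpha], then round it onto a
    uniform grid with 2**bits levels (assumed form, see lead-in)."""
    levels = 2 ** bits - 1            # number of quantization steps
    clipped = min(max(x, 0.0), alpha) # parameterized clipping
    scale = levels / alpha            # quantization scale set by alpha
    return round(clipped * scale) / scale


def pact_alpha_grad(x, alpha):
    """Assumed straight-through gradient of the output w.r.t. alpha:
    1 where the activation is clipped at alpha, 0 elsewhere."""
    return 1.0 if x >= alpha else 0.0


if __name__ == "__main__":
    alpha = 3.0
    for x in (-1.0, 1.4, 5.0):
        print(x, pact_forward(x, alpha, bits=4), pact_alpha_grad(x, alpha))
```

Under these assumptions, only activations at or above α contribute gradient to α, so stochastic gradient descent moves the clipping level (and with it the quantization step size) according to how many activations saturate.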

Keywords: deep learning, quantized deep neural network, activation quantization

Code: [3 community implementations](https://paperswithcode.com/paper/?openreview=By5ugjyCb)

Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [ImageNet](https://paperswithcode.com/dataset/imagenet), [SVHN](https://paperswithcode.com/dataset/svhn)

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/pact-parameterized-clipping-activation-for/code)
