Loss-aware Weight Quantization of Deep Networks

Lu Hou; James T. Kwok

15 Feb 2018 (modified: 22 Jun 2025) | ICLR 2018 Conference Blind Submission | Readers: Everyone

Abstract: The huge size of deep networks hinders their use in small computing devices. In this paper, we consider compressing the network by weight quantization. We extend a recently proposed loss-aware weight binarization scheme to ternarization, with possibly different scaling parameters for the positive and negative weights, and to m-bit (where m > 2) quantization. Experiments on feedforward and recurrent neural networks show that the proposed scheme outperforms state-of-the-art weight quantization algorithms, and is as accurate as (or even more accurate than) the full-precision network.

TL;DR: We propose a loss-aware weight quantization algorithm that directly considers the effect of quantization on the loss.

Keywords: deep learning, network quantization

Code: [houlu369/Loss-aware-weight-quantization](https://github.com/houlu369/Loss-aware-weight-quantization)

Community Implementations: [1 code implementation on CatalyzeX](https://www.catalyzex.com/paper/loss-aware-weight-quantization-of-deep/code)
