Double Rounding Quantization for Flexible Deep Neural Network Compression

23 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission
Keywords: Model Quantization, Double Rounding, Mixed-Precision Super-Net
Abstract: Model quantization is widely applied to compress and accelerate deep neural networks due to its simplicity and adaptability. The quantization bit-width is typically predefined for a given neural network. However, bit-width requirements vary across hardware platforms and transmission scenarios, so training and storing a separate model for each setting incurs considerable cost. Joint training of multiple bit-widths at once (multi-bit quantization) has been proposed to address this issue. In this paper, we propose a Double Rounding quantization method that stores the highest bit-width model instead of its full-precision counterpart and fully exploits the representable value range. Nevertheless, performance during once-joint training degrades significantly due to inconsistent gradients between high-bit and low-bit quantization. To tackle this problem, we adaptively adjust the learning rates of the different bit-widths during training. We also apply our method to mixed-precision super-nets and introduce a novel weighted-probability training strategy. Experimental results demonstrate that the proposed method outperforms state-of-the-art once-joint quantization-aware training methods on the ImageNet dataset. The code will be available soon.
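The abstract describes deriving low-bit models from a stored highest-bit-width model rather than from full precision. Below is a minimal, hypothetical sketch of one way such nested ("double") rounding could work, assuming symmetric uniform quantization with a single scale; the function name, bit-widths, and exact clamping are illustrative assumptions, not the paper's formulation.

```python
import torch

def double_rounding_quantize(w, scale, high_bits=8, low_bits=4):
    """Hypothetical sketch: quantize once to the highest bit-width, then
    derive the low-bit integers by rounding the stored high-bit integers
    again, so only the high-bit model needs to be saved."""
    # First rounding: full-precision weights -> highest bit-width integers.
    qmax_high = 2 ** (high_bits - 1) - 1
    q_high = torch.clamp(torch.round(w / scale), -qmax_high - 1, qmax_high)

    # Second rounding: low-bit integers obtained directly from q_high
    # (no access to the full-precision weights is needed).
    shift = 2 ** (high_bits - low_bits)
    qmax_low = 2 ** (low_bits - 1) - 1
    q_low = torch.clamp(torch.round(q_high / shift), -qmax_low - 1, qmax_low)

    # Dequantized views at both precisions.
    w_high = q_high * scale
    w_low = q_low * (scale * shift)
    return w_high, w_low
```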
Supplementary Material: pdf
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7387