SpQAT: A Sparse Quantization-Aware Training Method

22 Sept 2022 (modified: 13 Feb 2023) · ICLR 2023 Conference Withdrawn Submission · Readers: Everyone
Keywords: efficient training, quantization-aware training, network quantization
TL;DR: We develop an efficient sparse QAT method, dubbed SpQAT, based on the partly scratch-off lottery ticket phenomenon we observed.
Abstract: Quantization-aware training (QAT) has been demonstrated not only to reduce computational cost and storage footprint, but also to retain the performance of full-precision neural networks well. However, the tedious retraining requirement greatly weakens the practical value of QAT methods. In this paper, we attempt to reduce the training cost of QAT methods, which, to the best of our knowledge, has barely been investigated in the literature. Our motivation stems from a straightforward yet valuable observation: a large portion of quantized weights, referred to as the partly scratch-off lottery ticket, reach their optimal quantization level after only a few training epochs. This naturally inspires us to reduce computation by freezing these weights for the remaining training period. Accordingly, we develop an efficient sparse QAT method, dubbed SpQAT, which freezes a weight once the distance between its full-precision value and its quantization level falls below a controllable threshold. Along these lines, we show that SpQAT accurately identifies the partly scratch-off lottery ticket and yields a sparse weight gradient, where many weights are pulled out of training and their related computations are skipped. Extensive experiments demonstrate the efficacy of SpQAT, which achieves 20%-60% weight gradient sparsity. Even with the elimination of the related gradient calculations in backward propagation, SpQAT performs on par with or even better than the compared baselines.
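The abstract only sketches the freezing rule, so the following is a minimal, hypothetical PyTorch-style illustration of that idea, not the authors' implementation: assuming a uniform quantizer with step size `delta` and a user-chosen threshold `eps` (both names are placeholders), a weight is frozen once its full-precision value is close enough to its nearest quantization level, and frozen weights have their gradients zeroed so their updates are skipped.

```python
# Hypothetical sketch of the freezing criterion described in the abstract.
# `delta` (quantization step) and `eps` (controllable threshold) are assumed
# names; the actual SpQAT formulation may differ.
import torch

def update_freeze_mask(weight_fp, delta, eps, frozen_mask):
    """Freeze weights whose full-precision value lies within `eps`
    (measured in quantization steps) of their nearest quantization level."""
    # Nearest quantization level of each full-precision weight.
    q_level = torch.round(weight_fp / delta) * delta
    # Normalized distance to that level, in [0, 0.5].
    dist = (weight_fp - q_level).abs() / delta
    # Once frozen, a weight stays frozen; otherwise freeze if close enough.
    return frozen_mask | (dist < eps)

def mask_frozen_gradients(weight_fp, frozen_mask):
    """Zero the gradients of frozen weights so their updates (and the
    associated computation) are effectively skipped."""
    if weight_fp.grad is not None:
        weight_fp.grad[frozen_mask] = 0.0

# Example usage after each training epoch:
w = torch.randn(4, 4, requires_grad=True)
frozen = torch.zeros_like(w, dtype=torch.bool)
w.sum().backward()
frozen = update_freeze_mask(w.detach(), delta=0.05, eps=0.05, frozen_mask=frozen)
mask_frozen_gradients(w, frozen)
```

Under this reading, the fraction of `True` entries in the mask corresponds to the 20%-60% weight gradient sparsity reported in the abstract.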
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Applications (eg, speech processing, computer vision, NLP)
