Keywords: Network Quantization, Zero-shot Quantization
Abstract: How can we accurately quantize a pre-trained model without any data?
Quantization algorithms are widely used for deploying neural networks on resource-constrained edge devices.
Zero-shot Quantization (ZSQ) addresses the crucial and practical scenario where training data are inaccessible for privacy or security reasons.
However, three significant challenges hinder the performance of existing ZSQ methods: 1) noise in the synthetic dataset, 2) predictions based on off-target patterns, and 3) misguidance by erroneous hard labels.
In this paper, we propose SynQ (Synthesis-aware Fine-tuning for Zero-shot Quantization),
a carefully designed ZSQ framework to overcome the limitations of existing methods.
SynQ suppresses noise in the generated samples by applying a low-pass filter.
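The low-pass filtering step can be sketched as follows. This is a minimal illustration using an FFT-based circular frequency mask; the mask shape and the `cutoff` fraction are assumptions for the example, not the paper's actual filter design:

```python
import numpy as np

def low_pass_filter(image: np.ndarray, cutoff: float = 0.25) -> np.ndarray:
    """Suppress high-frequency noise in a synthetic sample via a 2-D FFT mask.

    `cutoff` is the fraction of the frequency radius to keep -- an
    illustrative hyperparameter, not the paper's setting.
    """
    h, w = image.shape
    # Shift the zero-frequency component to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # Build a circular mask that keeps only low-frequency components.
    ys, xs = np.ogrid[:h, :w]
    radius = np.sqrt((ys - h / 2) ** 2 + (xs - w / 2) ** 2)
    mask = radius <= cutoff * min(h, w) / 2
    # Zero out high frequencies, then invert back to the spatial domain.
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)

# Example: smoothing a noisy 32x32 synthetic sample.
rng = np.random.default_rng(0)
sample = rng.normal(size=(32, 32))
smoothed = low_pass_filter(sample)
```

Since the mask removes spectral energy, the filtered sample has strictly lower variance than the noisy input.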
Then, SynQ trains the quantized model to improve accuracy by aligning its class activation map with that of the pre-trained model.
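A class activation map (CAM) is the weighted sum of the final convolutional feature maps, using the classifier weights of the target class. A hypothetical alignment loss between the quantized and pre-trained models' CAMs can be sketched as below; the normalization and the mean-squared distance are illustrative choices, not the paper's exact formulation:

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """CAM for one class: classifier-weighted sum of conv feature maps.

    features: (C, H, W) final conv activations; fc_weights: (num_classes, C).
    """
    cam = np.tensordot(fc_weights[class_idx], features, axes=1)  # (H, W)
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)  # normalize to [0, 1]

def cam_alignment_loss(q_feats, q_w, fp_feats, fp_w, class_idx):
    """Mean-squared distance between quantized- and full-precision-model CAMs."""
    cam_q = class_activation_map(q_feats, q_w, class_idx)
    cam_fp = class_activation_map(fp_feats, fp_w, class_idx)
    return np.mean((cam_q - cam_fp) ** 2)

# Example with random features and weights (hypothetical shapes).
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 7, 7))
weights = rng.normal(size=(10, 8))
other_feats = rng.normal(size=(8, 7, 7))
same_loss = cam_alignment_loss(feats, weights, feats, weights, 3)
diff_loss = cam_alignment_loss(feats, weights, other_feats, weights, 3)
```

Identical features yield zero loss, so minimizing it pulls the quantized model's attended regions toward the pre-trained model's.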
Furthermore, SynQ mitigates misguidance from the pre-trained model's error by leveraging only soft labels for difficult samples.
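One way to realize "soft labels only for difficult samples" is to drop the hard-label cross-entropy term for samples on which the teacher itself is unconfident. The sketch below is an assumption-laden illustration: the confidence-threshold difficulty criterion, `difficulty_threshold`, and `temperature` are placeholders, not the paper's definitions:

```python
import numpy as np

def softmax(z, t=1.0):
    """Temperature-scaled softmax, numerically stabilized."""
    z = z / t
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def selective_loss(student_logits, teacher_logits, hard_labels,
                   difficulty_threshold=0.5, temperature=2.0):
    """Soft-label KD on all samples; hard-label CE only on 'easy' ones.

    A sample is deemed easy when the teacher's top softmax probability
    exceeds the threshold (illustrative criterion).
    """
    confidence = softmax(teacher_logits).max(axis=1)
    easy = confidence >= difficulty_threshold

    # Soft-label KD term (KL divergence) applied to every sample.
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kd = np.mean(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=1)) * temperature ** 2

    # Hard-label cross-entropy applied only to easy samples.
    if easy.any():
        p_easy = softmax(student_logits[easy])
        ce = -np.mean(np.log(p_easy[np.arange(easy.sum()), hard_labels[easy]] + 1e-12))
    else:
        ce = 0.0
    return kd + ce

# Example with random logits for a batch of 4 over 10 classes.
rng = np.random.default_rng(0)
loss = selective_loss(rng.normal(size=(4, 10)), rng.normal(size=(4, 10)),
                      rng.integers(0, 10, size=4))
```

Difficult samples thus contribute only the soft-label term, so an erroneous teacher hard label cannot misguide the quantized model on them.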
Extensive experiments show that SynQ achieves state-of-the-art accuracy, outperforming existing ZSQ methods.
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6292