Keywords: Zero-shot Quantization, Post-training quantization, mixed-precision quantization
TL;DR: A joint optimization framework for post-training quantization that alternately optimizes the calibration data and the model's bit-widths
Abstract: Mixed-precision quantization (MPQ) aims to identify the optimal bit-width for each layer when quantizing a model.
Zero-shot quantization (ZSQ), on the other hand,
aims to learn a quantized model from a pre-trained full-precision model in a data-free manner, commonly by generating a synthetic calibration set that is used to quantize the full-precision model. Although it is intuitive that an inherent correlation exists between the quality of the generated calibration dataset
and the bit allocation to the model's layers,
all existing frameworks treat them as separate problems. This paper proposes a novel method that jointly optimizes both the calibration set and the bit-width of each layer in the context of zero-shot quantization. Specifically, we first propose a novel data-optimization approach that takes into consideration the Gram-Gradient matrix constructed from the gradient vectors of calibration samples. We then propose a novel, scalable quadratic-optimization-based approach to identify the model's bit-widths. These two proposals are then combined into a single framework that jointly optimizes both the calibration data and the bit allocation to the model's layers.
Experimental results on the ImageNet dataset demonstrate the proposed method's superiority compared to current state-of-the-art techniques in ZSQ.
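To make the Gram-Gradient construction mentioned in the abstract concrete, here is a minimal sketch. It assumes (since the abstract gives no details) that each calibration sample contributes one flattened gradient vector, and that the Gram-Gradient matrix is the matrix of pairwise inner products of those vectors; the array sizes are toy values, not from the paper.

```python
import numpy as np

# Hypothetical illustration of a Gram-Gradient matrix:
# row i of J is the flattened gradient vector of calibration sample i
# (e.g., the gradient of a quantization loss w.r.t. the model parameters).
rng = np.random.default_rng(0)
num_samples, num_params = 4, 10                     # toy sizes (assumed)
J = rng.standard_normal((num_samples, num_params))  # per-sample gradients

# Gram-Gradient matrix: pairwise inner products of gradient vectors.
G = J @ J.T

assert G.shape == (num_samples, num_samples)
assert np.allclose(G, G.T)  # a Gram matrix is symmetric (and PSD)
```

Such a matrix captures how similar (or redundant) the calibration samples are in gradient space, which is one plausible signal for selecting or optimizing a calibration set.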
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6818