Optimal and Approximate Adaptive Stochastic Quantization

Published: 25 Sept 2024, Last Modified: 06 Nov 2024
Venue: NeurIPS 2024 poster
License: CC BY 4.0
Keywords: adaptive quantization, quantization, compression, algorithms, dynamic programming
TL;DR: Optimal and near-optimal methods for unbiasedly quantizing large vectors on the fly while minimizing the mean squared error for the particular input
Abstract: Quantization is a fundamental optimization for many machine learning (ML) use cases, including compressing gradients, model weights and activations, and datasets. The most accurate form of quantization is adaptive, where the error is minimized with respect to a given input rather than optimizing for the worst case. However, optimal adaptive quantization methods are considered infeasible in terms of both their runtime and memory requirements. We revisit the Adaptive Stochastic Quantization (ASQ) problem and present algorithms that find optimal solutions with asymptotically improved time and space complexities. Our experiments indicate that our algorithms may open the door to using ASQ more extensively in a variety of ML applications. We also present an even faster approximation algorithm for quantizing large inputs on the fly.
Primary Area: Optimization (convex and non-convex, discrete, stochastic, robust)
Submission Number: 2480
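
Illustration (not from the paper): the abstract describes quantization that is stochastic (each value is rounded randomly so that its expectation equals the input) and adaptive (the quantization levels are chosen for the particular input vector). The sketch below shows both ideas on a toy vector. It is a minimal sketch under stated assumptions, not the authors' algorithms: the function names are hypothetical, the search is an exponential-time brute force, and candidate levels are restricted to values that occur in the input, with the minimum and maximum always included.

```python
import bisect
import random
from itertools import combinations


def _neighbors(v, levels):
    """Return adjacent sorted levels (lo, hi) with lo <= v <= hi.

    Assumes min(levels) <= v <= max(levels) and at least two distinct levels.
    """
    i = min(max(bisect.bisect_left(levels, v), 1), len(levels) - 1)
    return levels[i - 1], levels[i]


def stochastic_quantize(x, levels, rng):
    """Round each entry to a neighboring level so that E[output] == input."""
    levels = sorted(set(levels))
    out = []
    for v in x:
        lo, hi = _neighbors(v, levels)
        # Rounding up with this probability makes the rounding unbiased.
        p_up = (v - lo) / (hi - lo)
        out.append(hi if rng.random() < p_up else lo)
    return out


def expected_mse(x, levels):
    """Expected squared error of the unbiased rounding, averaged over entries.

    A two-point distribution on {lo, hi} with mean v has variance
    (v - lo) * (hi - v).
    """
    levels = sorted(set(levels))
    total = 0.0
    for v in x:
        lo, hi = _neighbors(v, levels)
        total += (v - lo) * (hi - v)
    return total / len(x)


def brute_force_adaptive_levels(x, s):
    """Hypothetical helper: try every set of s levels drawn from the input.

    Assumption of this sketch: levels are input values, and the minimum and
    maximum are always kept so that every entry has two neighbors.
    """
    xs = sorted(set(x))
    if s < 2 or s > len(xs):
        raise ValueError("need 2 <= s <= number of distinct input values")
    candidates = ((xs[0], *mid, xs[-1]) for mid in combinations(xs[1:-1], s - 2))
    return list(min(candidates, key=lambda lv: expected_mse(x, lv)))


x = [0.0, 0.1, 0.2, 0.9, 1.0]
uniform = [0.0, 0.5, 1.0]                     # input-independent grid
adaptive = brute_force_adaptive_levels(x, 3)  # levels fitted to this input
print("uniform  MSE:", expected_mse(x, uniform))
print("adaptive MSE:", expected_mse(x, adaptive))
print("one sample:  ", stochastic_quantize(x, adaptive, random.Random(0)))
```

On this toy input the fitted levels give a lower expected error than the uniform grid with the same number of levels. The brute force enumerates every combination of interior levels, so its cost grows combinatorially with the input size, which is the kind of cost that makes naive optimal adaptive quantization impractical for large vectors.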