Keywords: quantization, straight-through estimator, quantization-aware training
Abstract: We study the problem of training neural networks with quantized parameters.
Learning low-precision quantized parameters with gradients computed through the Straight-Through Estimator (STE) can be challenging.
While the STE enables first-order optimization via back-propagation, recent works have explored zeroth-order (ZO) gradient descent for fine-tuning.
We note that the STE provides high-quality but biased gradients, whereas ZO gradients are unbiased but expensive to compute.
We thus propose First-Order-Guided Zeroth-Order Gradient Descent (FOGZO), which reduces STE bias while requiring less computation than ZO methods.
Empirically, we show FOGZO improves the tradeoff between quality and training time in Quantization-Aware Pre-Training.
Specifically, at the same number of iterations, FOGZO improves over STE by 1-8% accuracy on DeiT Tiny/Small, 1-2% accuracy on ResNet 18/50, and 1-22 perplexity points on LLaMA models with up to 0.3 billion parameters. For the same loss, FOGZO yields a 796$\times$ reduction in computation versus n-SPSA for a 2-layer MLP on MNIST. Code is available at [https://github.com/1733116199/fogzo](https://github.com/1733116199/fogzo).
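To make the contrast concrete, below is a minimal illustrative sketch (not the paper's FOGZO implementation) of the two gradient estimators the abstract compares, assuming PyTorch; the sign quantizer, the SPSA-style estimator, and all names (`ste_quantize`, `spsa_gradient`, `mu`, `n_samples`) are hypothetical choices for exposition.

```python
# Illustrative sketch only, not the paper's FOGZO method: it contrasts the two
# gradient estimators referred to in the abstract, assuming PyTorch and a toy
# sign quantizer. All names below (ste_quantize, spsa_gradient, mu, n_samples)
# are hypothetical and chosen for exposition.
import torch


def ste_quantize(w: torch.Tensor) -> torch.Tensor:
    """Forward: quantize to {-1, +1}; backward: pass gradients straight through."""
    q = torch.sign(w)               # non-differentiable quantizer
    return w + (q - w).detach()     # value of q in forward, identity in backward (STE)


def spsa_gradient(loss_fn, w: torch.Tensor, mu: float = 1e-3, n_samples: int = 4) -> torch.Tensor:
    """n-sample SPSA-style zeroth-order gradient estimate (forward passes only)."""
    grad = torch.zeros_like(w)
    for _ in range(n_samples):
        u = torch.sign(torch.randn_like(w))               # random +/-1 perturbation
        delta = loss_fn(w + mu * u) - loss_fn(w - mu * u)
        grad += (delta / (2.0 * mu)) * u                  # central-difference estimate
    return grad / n_samples


if __name__ == "__main__":
    target = torch.tensor([1.0, -1.0, 1.0])

    # STE demo: loss on quantized weights, biased gradient via back-propagation.
    w = torch.tensor([0.3, -0.2, 0.5], requires_grad=True)
    ((ste_quantize(w) - target) ** 2).sum().backward()
    print("STE gradient :", w.grad)

    # ZO demo: same quantized loss, gradient estimated without back-propagation.
    # A coarse mu is used so perturbations actually cross quantization boundaries.
    quant_loss = lambda v: ((torch.sign(v) - target) ** 2).sum()
    print("SPSA gradient:", spsa_gradient(quant_loss, w.detach(), mu=0.5, n_samples=64))
```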
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 6228