2D Quantization for Ultra‑low‑bit Optimizers

16 Sept 2025 (modified: 14 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: memory-efficient, first-order optimizer, quantization, 1.5-bit, 2-bit
TL;DR: We propose the first 1.5-bit and 2-bit first-order optimizers for neural network pretraining that match the performance of their 16/32-bit counterparts.
Abstract: Optimizer states used to accelerate neural network training become a significant memory bottleneck as model size grows. A common mitigation is to compress these high-precision states to low-bit representations, but existing methods typically stop at 4 bits. In this paper, we push the bitwidth of AdamW and Adafactor states down to 1.5 and 2 bits by mapping high-precision values to their nearest low-bit representations in a two-dimensional (2D) polar space, which we call 2D quantization. This is effective because optimizer states exhibit a quasi-Gaussian distribution with strong circular symmetry. To further improve efficiency, we offer concrete design principles for both signed and unsigned data, and we validate the superiority of our approach over traditional 1D quantization through static experiments on real momentum matrices. Across a range of pretraining and fine-tuning benchmarks—including image classification and natural language modeling—our ultra-low-bit AdamW and Adafactor match the performance of their 16/32-bit counterparts while dramatically reducing memory usage.
Primary Area: optimization
Submission Number: 7148
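
For intuition, the sketch below illustrates the polar-space idea described in the abstract: consecutive pairs of optimizer-state entries are mapped to the nearest codeword of a small 2D codebook laid out on concentric circles, so that 8 codewords per pair amount to 3 bits per pair, i.e. 1.5 bits per value. This is an illustrative assumption of how such a quantizer could look, not the paper's actual codebook design; the ring radii, angles per ring, per-tensor scaling, and the helper names (polar_codebook, quantize_pairs, dequantize_pairs) are all hypothetical, and the paper's signed/unsigned design principles are not reproduced here.

```python
import numpy as np

def polar_codebook(radii, angles_per_ring):
    """Build a 2D codebook on concentric circles (a polar grid).

    radii: ring radii; angles_per_ring: number of codewords per ring.
    Total codewords = sum(angles_per_ring); 8 codewords per pair of
    values corresponds to 3 bits/pair, i.e. 1.5 bits per value.
    (Hypothetical layout, not the paper's actual design.)
    """
    points = []
    for r, n in zip(radii, angles_per_ring):
        thetas = np.arange(n) * (2.0 * np.pi / n)
        points.append(np.stack([r * np.cos(thetas), r * np.sin(thetas)], axis=1))
    return np.concatenate(points, axis=0)  # shape (K, 2)

def quantize_pairs(x, codebook):
    """Map consecutive pairs of entries to nearest-codeword indices (assumes even length)."""
    pairs = x.reshape(-1, 2)                                      # (N/2, 2)
    dists = ((pairs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)                                   # (N/2,) integer codes

def dequantize_pairs(codes, codebook):
    """Reconstruct a flat vector from codeword indices."""
    return codebook[codes].reshape(-1)

# Toy usage: quantize a quasi-Gaussian "momentum" vector at ~1.5 bits/value.
rng = np.random.default_rng(0)
m = rng.normal(scale=1e-2, size=4096).astype(np.float32)
scale = np.abs(m).max()                                           # simple per-tensor scale (assumption)
cb = polar_codebook(radii=[0.3, 0.8], angles_per_ring=[4, 4])     # 8 codewords = 3 bits/pair
codes = quantize_pairs(m / scale, cb)
m_hat = dequantize_pairs(codes, cb) * scale
print("reconstruction MSE:", float(((m - m_hat) ** 2).mean()))
```

In this toy setup, only the small integer codes (plus the scale and the fixed codebook) would need to be stored, which is where the memory savings over 16/32-bit states would come from.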