Fast and Accurate Fisher-Guided Quantization via Efficient Kronecker Factor Approximation

16 Sept 2025 (modified: 04 Jan 2026) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: Quantization, Large Language Models, Fisher Information Matrix, Kronecker-Factored Approximation, Model Compression, Second-Order Methods
TL;DR: We propose an accelerated Fisher information decomposition for the YAQA method, achieving the same quantization accuracy with ~10× lower computational cost.
Abstract: Quantization with second-order information has shown strong promise for preserving model quality under aggressive compression. Building on the recent YAQA framework, which computes Kronecker-factored approximations of the Hessian via power iteration, we replace this step with a more efficient Kronecker decomposition method from GFWSVD. This formulation preserves the benefits of curvature-aware quantization while substantially reducing computational cost. We apply our method to LLaMA-2 7B, LLaMA-3 8B Instruct, and Qwen 3 8B Instruct and demonstrate that it achieves the same post-quantization model quality as YAQA at much lower cost: the Kronecker factors that provide the required quality are obtained with 10 times fewer tokens and an approximately $10\times$ speedup over the original work.
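For readers unfamiliar with Kronecker-factored curvature approximations, the following is a minimal sketch of the general idea only. It is not the YAQA or GFWSVD procedure, whose details are not given in the abstract. It assumes a small dense symmetric matrix `H` of size `(m*n) x (m*n)` and finds factors `A` (`m x m`) and `B` (`n x n`) minimizing the Frobenius error of `H - kron(A, B)`, using the classical rearrangement trick (Van Loan and Pitsianis) followed by power iteration for the leading singular pair. The function name `nearest_kronecker` and all parameters are hypothetical; at LLM scale `H` cannot be formed densely, so practical methods work from gradient samples instead.

```python
import numpy as np

def nearest_kronecker(H, m, n, iters=50, seed=0):
    """Illustrative sketch: best Frobenius-norm fit H ~ kron(A, B).

    H is (m*n, m*n); A is (m, m); B is (n, n).
    """
    # Rearrange H so that kron(A, B) becomes the rank-1 matrix vec(A) vec(B)^T.
    # H[i*n + k, j*n + l] -> R[i*m + j, k*n + l]
    R = H.reshape(m, n, m, n).transpose(0, 2, 1, 3).reshape(m * m, n * n)

    # Power iteration for the leading singular triplet (sigma, u, v) of R.
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n * n)
    v /= np.linalg.norm(v)
    sigma = 0.0
    for _ in range(iters):
        u = R @ v
        u /= np.linalg.norm(u)
        v = R.T @ u
        sigma = np.linalg.norm(v)
        v /= sigma

    # The factors are defined only up to a shared scale and sign;
    # here the scale sigma is placed entirely on A.
    A = sigma * u.reshape(m, m)
    B = v.reshape(n, n)
    return A, B

# Toy check on a matrix that is exactly a Kronecker product.
rng = np.random.default_rng(1)
A0 = rng.standard_normal((3, 3))
A0 = A0 @ A0.T  # symmetric positive semidefinite, like a curvature factor
B0 = rng.standard_normal((4, 4))
B0 = B0 @ B0.T
H = np.kron(A0, B0)

A, B = nearest_kronecker(H, 3, 4)
rel_err = np.linalg.norm(np.kron(A, B) - H) / np.linalg.norm(H)
```

When `H` is an exact Kronecker product, as in the toy check, the rearranged matrix has rank one and the reconstruction error is at the level of floating-point round-off. For a general `H`, the result is the best single-term Kronecker fit, and the number of power iterations needed depends on the gap between the first two singular values of the rearranged matrix.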
Supplementary Material: zip
Primary Area: optimization
Submission Number: 8018