Implicit Off-Diagonal Curvature Modeling via Gradient Projection for Post-Training Quantization of Vision Transformers

Published: 01 Jun 2026, Last Modified: 04 Jun 2026
Venue: AdaptFM Poster
License: CC BY 4.0
Keywords: Vision Transformer Quantization
Abstract: In this work, we propose Gradient-Projected Fisher Approximation for Quantization (GPFA-Q), a block-reconstruction-based post-training quantization (PTQ) framework that captures off-diagonal curvature interactions without explicitly constructing a curvature matrix. First, we introduce Gradient-Projected Reconstruction (GPR), which reformulates the Fisher quadratic objective in terms of gradient projections, enabling implicit modeling of cross-dimensional interactions. To support GPR, we integrate Soft Grid Rounding (SGR), which reduces the mismatch between continuous reconstruction and discrete inference so that gradient projections remain consistent with the quantized model. Extensive experiments demonstrate that GPFA-Q achieves state-of-the-art performance in low-bit quantization across diverse vision tasks.
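The abstract does not spell out the GPR objective, so the following is only a minimal sketch of the general identity that such a reformulation could rest on. It assumes an empirical Fisher matrix defined as the average outer product of per-sample gradients, F = mean_n g_n g_n^T, in which case the quadratic form delta^T F delta equals mean_n (g_n . delta)^2 and can be evaluated without ever building F. The function names, the toy gradients, and the perturbation `delta` are all hypothetical and are not taken from the paper.

```python
# Toy illustration (assumption: empirical Fisher F = mean_n g_n g_n^T).
# All names and numbers here are hypothetical, not the authors' implementation.

def fisher_quadratic_explicit(grads, delta):
    """delta^T F delta with the d x d matrix F built explicitly."""
    n, d = len(grads), len(delta)
    fisher = [[sum(g[i] * g[j] for g in grads) / n for j in range(d)]
              for i in range(d)]
    return sum(delta[i] * fisher[i][j] * delta[j]
               for i in range(d) for j in range(d))

def fisher_quadratic_projected(grads, delta):
    """Same value as mean_n (g_n . delta)^2, with no matrix stored."""
    projections = [sum(gi * di for gi, di in zip(g, delta)) for g in grads]
    return sum(p * p for p in projections) / len(grads)

def fisher_quadratic_diagonal(grads, delta):
    """Diagonal-only approximation: drops every cross term F_ij, i != j."""
    n, d = len(grads), len(delta)
    return sum((sum(g[i] ** 2 for g in grads) / n) * delta[i] ** 2
               for i in range(d))

# Two per-sample gradients and one output perturbation (e.g. quantization error).
grads = [[1.0, 2.0], [3.0, -1.0]]
delta = [0.5, -0.25]

explicit = fisher_quadratic_explicit(grads, delta)
projected = fisher_quadratic_projected(grads, delta)
diagonal = fisher_quadratic_diagonal(grads, delta)

print(explicit, projected, diagonal)
```

In this toy case the projected form reproduces the full quadratic exactly, while the diagonal approximation gives a different value because it ignores the off-diagonal term. How GPFA-Q actually defines its projections and block-wise losses is not stated in the abstract.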
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 52