Keywords: Monocular Depth Estimation, Post-Training Quantization
Abstract: Monocular Depth Estimation (MDE) foundation models such as Depth Anything achieve strong generalization across diverse scenes, but their high computational and memory costs hinder efficient deployment.
Post-Training Quantization (PTQ) offers a practical compression strategy, but low-bit PTQ for MDE remains challenging for two reasons: quantizing the query and key projections independently fails to preserve the attention maps they induce, and accumulated quantization errors shift the distributions of intermediate features.
To address these challenges, we propose SPA-Q, a Structure-Preserving Adaptive PTQ framework for MDE.
SPA-Q introduces Attention-Preserving Calibration (APC), which calibrates quantization parameters by directly preserving the full-precision attention distributions, and Channel-Wise Distribution Alignment (CWDA), which mitigates quantization-induced feature distribution shifts through channel-wise affine transformations that are absorbed into the weights after training.
Experiments on NYUv2 and KITTI show that SPA-Q consistently improves 4-bit quantization performance over existing PTQ methods, reducing AbsRel by 29.5\% and improving $\delta_1$ by 20.2\% on average.
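The two ideas named in the abstract can be illustrated with a small NumPy sketch. It is not the SPA-Q implementation: the abstract does not specify the APC objective or how the CWDA parameters are trained, so the KL-divergence loss, the joint grid search over clipping scales, and all function names below (`fake_quant`, `attention_kl`, `search_qk_scales`, `fold_channel_affine`) are assumptions chosen for illustration. The first part picks query and key scales jointly by comparing attention maps rather than the projections themselves; the second shows why a channel-wise affine transformation on a linear layer's output can be folded into that layer's weights and bias.

```python
import numpy as np

def fake_quant(x, scale, bits=4):
    """Symmetric uniform fake quantization: snap to the integer grid, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention_kl(q, k, q_hat, k_hat):
    """Mean KL divergence between full-precision and quantized attention rows.
    (Assumed objective; the abstract only says attention distributions are preserved.)"""
    d = q.shape[-1]
    p = softmax(q @ k.T / np.sqrt(d))
    p_hat = softmax(q_hat @ k_hat.T / np.sqrt(d))
    eps = 1e-12
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(p_hat + eps)), axis=-1)))

def search_qk_scales(q, k, bits=4, n_grid=20):
    """Grid-search the query and key scales jointly, scoring each pair by the
    attention-map KL instead of the per-tensor reconstruction error."""
    qmax = 2 ** (bits - 1) - 1
    fracs = np.linspace(0.3, 1.0, n_grid)  # fraction of the max-abs clipping range
    best = (np.inf, None, None)
    for fq in fracs:
        sq = fq * np.abs(q).max() / qmax
        q_hat = fake_quant(q, sq, bits)
        for fk in fracs:
            sk = fk * np.abs(k).max() / qmax
            loss = attention_kl(q, k, q_hat, fake_quant(k, sk, bits))
            if loss < best[0]:
                best = (loss, sq, sk)
    return best  # (loss, query scale, key scale)

def fold_channel_affine(W, b, gamma, beta):
    """Absorb y' = gamma * (W x + b) + beta into a single linear layer.
    W has shape (out, in); gamma and beta have one entry per output channel."""
    return gamma[:, None] * W, gamma * b + beta

# Usage on random data (shapes are arbitrary illustration values).
rng = np.random.default_rng(0)
q, k = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
loss, scale_q, scale_k = search_qk_scales(q, k)

W, b = rng.normal(size=(4, 16)), rng.normal(size=4)
gamma, beta = rng.normal(size=4), rng.normal(size=4)
W_folded, b_folded = fold_channel_affine(W, b, gamma, beta)
```

Because the grid includes the unclipped max-abs scale, the searched pair is never worse than independent min-max calibration under this loss, and the folded layer reproduces the affine-corrected output exactly, so the correction adds no inference cost.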
Submission Number: 45