MedQuanBench: Quantization-Aware Analysis for Efficient Medical Imaging Models

ICLR 2026 Conference Submission 13791 Authors

18 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Quantization, medical imaging, 3D, benchmark, efficiency, sensitivity
Abstract: Quantization is a key technique for enabling the deployment of medical AI models, especially on 3D radiological data. However, existing studies often lack comprehensive evaluations across diverse architectures, modalities, and quantization techniques, which limits our understanding of the real-world trade-offs among applicability, efficiency, and performance. In this work, we introduce MedQuanBench, a large-scale and diverse benchmark designed to rigorously evaluate quantization techniques for 3D medical imaging models. Our benchmark spans a wide range of modern architectures (e.g., CNNs and Transformers). We systematically evaluate representative post-training quantization strategies across model scales and dataset sizes. Additionally, we perform detailed sensitivity analyses to identify which model components are most vulnerable to quantization, including layer-wise degradation and activation distribution shifts. Our results show that 8-bit quantization consistently preserves segmentation accuracy across diverse architectures, making it a reliable choice for deployment. Furthermore, with appropriate configuration, such as selecting a quantization granularity suited to the model structure, 4-bit precision can also achieve near-lossless performance. These results establish MedQuanBench as a foundational benchmark for optimizing quantization strategies and guiding the development of deployment-ready, low-bit medical imaging models.
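To illustrate the granularity point raised in the abstract, the sketch below (not the paper's code; a minimal PyTorch example written for this page) fake-quantizes a weight tensor with symmetric post-training quantization at per-tensor versus per-channel granularity, showing why the choice tends to matter more at 4-bit than at 8-bit. The function name, tensor shapes, and outlier setup are illustrative assumptions, not artifacts from the benchmark.

```python
# Minimal sketch (assumed PyTorch; not the benchmark's implementation):
# symmetric post-training weight quantization at two granularities.
import torch

def quantize_symmetric(w: torch.Tensor, num_bits: int, per_channel: bool) -> torch.Tensor:
    """Fake-quantize weights: map to signed integers, then dequantize back to float."""
    qmax = 2 ** (num_bits - 1) - 1
    if per_channel:
        # One scale per output channel (dim 0), as is common for conv/linear weights.
        max_abs = w.abs().amax(dim=tuple(range(1, w.ndim)), keepdim=True)
    else:
        # A single scale shared by the whole tensor.
        max_abs = w.abs().max()
    scale = max_abs.clamp(min=1e-8) / qmax
    return (w / scale).round().clamp(-qmax - 1, qmax) * scale

# Toy 3D-conv-like weight with one outlier channel that inflates the per-tensor scale.
w = torch.randn(64, 32, 3, 3, 3)
w[0] *= 20.0

for bits in (8, 4):
    for per_channel in (False, True):
        err = (quantize_symmetric(w, bits, per_channel) - w).abs().mean()
        print(f"{bits}-bit, per_channel={per_channel}: mean abs error {err:.4f}")
```

Under these assumptions, per-tensor and per-channel scaling give similar reconstruction error at 8-bit, while at 4-bit the per-channel variant is noticeably closer to the original weights, mirroring the abstract's observation that 4-bit precision can approach lossless behavior only with an appropriate granularity choice.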
Supplementary Material: zip
Primary Area: datasets and benchmarks
Submission Number: 13791