Calibration-Aware Prompt Learning for Medical Vision-Language Models

Published: 15 Jul 2025 · Last Modified: 12 Nov 2025 · The 36th British Machine Vision Conference 2025 (BMVC 2025) · CC BY 4.0
Abstract: Medical Vision-Language Models (Med-VLMs) have demonstrated remarkable performance across diverse medical imaging tasks by leveraging large-scale image-text pretraining. However, their confidence calibration remains largely unexplored and poses a significant challenge: miscalibrated predictions can lead to overconfident errors, undermining clinical trust and decision-making reliability. To address this, we introduce \texttt{CalibPrompt}, the first framework to calibrate Med-VLMs during prompt tuning. \texttt{CalibPrompt} optimizes a small set of learnable prompts with carefully designed calibration objectives in a scarce labeled-data regime. First, we study a regularizer that aligns the smoothed accuracy with the model's predicted confidences. Second, we introduce an angular separation loss that maximizes textual feature proximity to improve the reliability of confidence estimates in multimodal Med-VLMs. Extensive experiments on four publicly available Med-VLMs and five diverse medical imaging datasets show that \texttt{CalibPrompt} consistently improves calibration without drastically affecting clean accuracy. Our code is available at \url{https://github.com/iabh1shekbasu/CalibPrompt}.
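The abstract names two calibration objectives without giving their exact forms. Below is a minimal PyTorch sketch of how such objectives could look: a regularizer that pulls the predicted confidence toward a smoothed correctness target, and an angular term that raises the pairwise cosine similarity of class-prompt text features. The function names, the label-smoothing target, and the MSE/cosine formulations are assumptions for illustration, not the paper's definitions; the actual objectives are specified in the paper and the linked repository.

```python
import torch
import torch.nn.functional as F

def confidence_alignment_loss(logits: torch.Tensor,
                              labels: torch.Tensor,
                              smoothing: float = 0.1) -> torch.Tensor:
    """Hypothetical regularizer: penalize the gap between each sample's
    predicted confidence and a label-smoothed correctness target."""
    probs = F.softmax(logits, dim=-1)
    conf, preds = probs.max(dim=-1)          # top-1 confidence and prediction
    correct = (preds == labels).float()
    # Smoothed target: 1 - smoothing for correct samples, smoothing otherwise.
    target = correct * (1.0 - smoothing) + (1.0 - correct) * smoothing
    return F.mse_loss(conf, target)

def angular_proximity_loss(text_features: torch.Tensor) -> torch.Tensor:
    """Hypothetical angular term: encourage class-prompt text embeddings
    to stay angularly close (high pairwise cosine similarity)."""
    feats = F.normalize(text_features, dim=-1)   # unit-norm per class
    sim = feats @ feats.t()                      # pairwise cosine similarities
    k = sim.size(0)
    off_diag = sim[~torch.eye(k, dtype=torch.bool)]  # drop self-similarity
    # Maximizing proximity == minimizing mean angular gap (1 - cosine).
    return (1.0 - off_diag).mean()
```

In a prompt-tuning loop, these terms would plausibly be added to the usual cross-entropy objective with small weights, e.g. `loss = ce + lam1 * confidence_alignment_loss(logits, labels) + lam2 * angular_proximity_loss(text_features)`, with only the prompt vectors receiving gradients.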