CARE: Confidence-Aware REasoning for Reliable Medical-VQA

16 Sept 2025 (modified: 14 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: Reinforcement Finetuning; Medical VQA; Confidence Sampling
TL;DR: We developed CARE, a medical MLLM for reliable diagnosis. Using Chain-of-Thought and confidence-aware training, it delivers more accurate, trustworthy, and interpretable answers for medical visual questions.
Abstract: Multimodal Large Language Models (MLLMs) have made significant progress in the medical field, yet their insufficient diagnostic reliability remains a major barrier to clinical application. To address this issue, we propose CARE, a novel MLLM for medical Visual Question Answering (VQA) that integrates Chain-of-Thought (CoT) reasoning and confidence awareness into its training. CARE pursues reliable diagnosis in two ways. First, during Supervised Fine-Tuning (SFT), it uses CoT to simulate the diagnostic reasoning process of physicians. Second, during Reinforcement Fine-Tuning (RFT), it incorporates confidence estimation into the reward function, which significantly enhances both answer accuracy and reasoning trustworthiness. Experimental results show that CARE consistently outperforms existing methods across multiple Medical-VQA benchmarks and generalizes well to diverse medical scenarios. These results indicate that CARE substantially improves task accuracy and model reliability while delivering more interpretable answers.
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 7414