Keywords: MLLM, Medical, Post-Training
Abstract: Multimodal Large Language Models (MLLMs) have achieved strong performance in general visual understanding and reasoning; however, their progress in the medical domain remains constrained by the scarcity of informative multimodal medical data and the limited effectiveness of Reinforcement Learning with Verifiable Rewards (RLVR). Moreover, existing work often lacks an in-depth exploration of multimodal medical tasks.
To address these issues, during supervised fine-tuning (SFT), we jointly incorporate high-quality textual reasoning data, general multimodal data, and multimodal medical data to enhance foundational medical knowledge while preserving the base model’s reasoning capability. Furthermore, to mitigate sparse-information scenarios common in medical datasets, we synthesize reflective-pattern-injected chain-of-thought (CoT) data in addition to standard CoT, endowing the model with structured reflective reasoning and providing a strong initialization for subsequent RLVR training.
Based on this training paradigm, we introduce the InfiMed-Series, including InfiMed-SFT-3B and InfiMed-RL-3B, which achieve state-of-the-art performance across seven multimodal medical benchmarks. Notably, InfiMed-RL-3B attains an average accuracy of 59.2\%, outperforming larger models such as InternVL3-8B (57.3\%), while using only 188K SFT samples and 36K RLVR samples.
Finally, we conduct extensive experiments to explore a range of fundamental research questions regarding data composition, reasoning strategies, and training paradigms in multimodal medical models. Our findings provide meaningful insights for the future development of medical MLLMs.
Paper Type: Long
Research Area: Clinical and Biomedical Applications
Research Area Keywords: NLP Applications
Languages Studied: English, Chinese
Submission Number: 5768