MedResearcher-R1: Expert-Level Medical Deep Researcher via A Knowledge-Informed Trajectory Synthesis Framework

ICLR 2026 Conference Submission 19683 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Medical Deep Research, Knowledge-Informed Trajectory Synthesis, Multi-hop Medical Reasoning, Tool-Augmented Agents
Abstract: Recent advances in Large Language Model (LLM)-based agents have enabled strong performance in deep research tasks, yet they remain limited in the medical domain. Leading proprietary systems achieve only modest results on complex medical benchmarks, revealing two critical limitations: (1) insufficient dense medical knowledge for clinical reasoning, and (2) a lack of specialized retrieval mechanisms for authoritative medical sources. We introduce MedResearcher-R1, a medical deep research agent that addresses these challenges with two key innovations. First, we propose a novel Knowledge-Informed Trajectory Synthesis (KISA) approach that builds medical knowledge graphs to construct complex multi-hop question–answer pairs around rare medical entities, overcoming the scarcity of high-quality training data. Second, we integrate a custom-built private medical retrieval engine alongside general-purpose tools, enabling accurate and reliable evidence synthesis. Our approach yields over 2,100 diverse trajectories across 12 medical specialties. Trained via supervised fine-tuning followed by reinforcement learning with composite rewards, our MedResearcher-R1-32B achieves state-of-the-art performance on MedBrowseComp (27.5/50 vs. 25.5/50 for o3-deepresearch) while demonstrating strong general performance on the GAIA and xBench benchmarks. To the best of our knowledge, we present the first high-quality, tool-using medical dataset and a domain-specific deep-research agent, together enabling smaller open-source models to outperform much larger proprietary systems in specialized medical tasks.
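The knowledge-informed synthesis idea in the abstract can be sketched as follows. This is a hedged illustration, not the authors' implementation: it walks a toy medical knowledge graph outward from a starting entity and chains the traversed relations into a multi-hop question whose answer is the final node. The entities, relations, and the `multi_hop_qa` helper are all invented for illustration.

```python
# Hedged sketch of knowledge-graph-based multi-hop QA synthesis (not the
# authors' code). A real pipeline would mine rare entities and relations
# from medical sources; here the graph is a hand-written toy example.

# Toy knowledge graph: entity -> list of (relation, target entity) edges.
GRAPH = {
    "Fabry disease": [("caused_by_mutation_in", "GLA gene")],
    "GLA gene": [("encodes", "alpha-galactosidase A")],
    "alpha-galactosidase A": [("deficiency_treated_by", "agalsidase beta")],
}

def multi_hop_qa(start, hops):
    """Walk `hops` edges from `start`; return a chained question and its answer."""
    entity, relations = start, []
    for _ in range(hops):
        relation, entity = GRAPH[entity][0]  # take the first edge for simplicity
        relations.append(relation.replace("_", " "))
    question = (
        f"Starting from {start}, follow: "
        + ", then ".join(relations)
        + ". What entity do you reach?"
    )
    return question, entity

question, answer = multi_hop_qa("Fabry disease", 3)
print(question)
print(answer)  # agalsidase beta
```

In this framing, longer paths yield harder questions: an agent must resolve each intermediate entity (here via retrieval tools) before the final answer becomes reachable.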
Supplementary Material: pdf
Primary Area: foundation or frontier models, including LLMs
Submission Number: 19683