AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation
Keywords: Medical Large Multimodal Models; Chain of Thought; Anatomical Ontology; Chest X-rays; Medical Visual Question Answering
Abstract: Chest X-rays (CXRs) are the most frequently performed imaging examinations in clinical settings. Recent advancements in Medical Large Multimodal Models (MLMMs) have enabled automated CXR interpretation, improving diagnostic accuracy and efficiency. However, despite their strong visual understanding, current MLMMs still face two major challenges: (1) insufficient region-level understanding and interaction, and (2) limited accuracy and interpretability due to single-step prediction. In this paper, we address these challenges by empowering MLMMs with anatomy-centric reasoning capabilities to enhance their interactivity and explainability. Specifically, we propose an Anatomical Ontology-Guided Reasoning (AOR) framework that accommodates both textual and optional visual prompts, centered on region-level information to enable multimodal, multi-step reasoning. We also develop AOR-Instruction, a large instruction dataset for training MLMMs, constructed under the guidance of expert physicians. Our experiments demonstrate AOR's superior performance on both Visual Question Answering (VQA) and report generation tasks. Code and data are available at: https://github.com/Liqq1/AOR.
Supplementary Material: zip
Primary Area: Machine learning for sciences (e.g. climate, health, life sciences, physics, social sciences)
Submission Number: 12783