Keywords: Scientific reasoning, Knowledge graphs, Iterative multi-step reasoning, Benchmark construction
Abstract: Large language models (LLMs) are increasingly used for scientific question answering, yet in genomics they still exhibit limited reasoning capability and insufficient explainability.
Moreover, under a controlled knowledge-increment setting, it remains unclear which knowledge channel is actually necessary for producing correct answers.
To address these challenges, we propose X-R2, an iterative reasoning framework that performs mandatory question decomposition given a query $Q$ and optional document evidence $\mathcal{D}$, while explicitly separating three additional knowledge channels:
(i) entity-centric knowledge,
(ii) KG triple knowledge, and
(iii) parametric knowledge.
At each iteration, X-R2 extracts step-level entities and constraints, acquires evidence from the enabled channels, and triggers re-decomposition when a self-check detects missing or inconsistent support, thereby improving both accuracy and explainability.
To enable fine-grained attribution across these channels, we construct X-R2 Bench.
Experiments on this benchmark show that X-R2 consistently outperforms direct one-shot generation and confirm that instance-aligned KG triples provide the largest marginal gains.
Paper Type: Long
Research Area: Retrieval-Augmented Language Models
Research Area Keywords: pragmatic inference and reasoning
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 10186