PathwayLM: Multihop Mechanistic Pathways for Biomedical Language Model Reasoning

ACL ARR 2026 January Submission10702 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0

Keywords: Biomedical Reasoning, Neuro-symbolic Methods

Abstract: We introduce a high-throughput framework for semi-automatically constructing multihop reasoning datasets in the biomedical domain. A neuro-symbolic information extraction (IE) system first extracts individual biomedical interactions; a constraint-based path construction algorithm then aggregates them into complete paths and filters out noise. Using this framework, we construct over 5 million semantically consistent 2-hop paths from 4 million biomedical publications, and we manually curate 137 of these paths into a "gold" test partition. We use the dataset to evaluate how well LLMs reason mechanistically in the biomedical domain. Our evaluation shows that: (a) biomedical reasoning remains an open research problem; and (b) using the IE system as scaffolding for LLM reasoning is a promising practical avenue that doubles reasoning performance.
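
The path construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Interaction` record, the `two_hop_paths` function, the placeholder entity names, and the two constraints (no self-loops, no path that returns to its starting entity) are all assumptions made for illustration, since the abstract does not specify the actual constraints.

```python
from collections import defaultdict
from typing import NamedTuple


class Interaction(NamedTuple):
    """One extracted interaction (hypothetical record layout)."""
    source: str
    relation: str
    target: str


def two_hop_paths(interactions):
    """Chain interactions A -> B and B -> C into 2-hop paths A -> B -> C.

    The filters below are assumed stand-ins for the paper's constraints.
    """
    # Index interactions by their source entity for fast chaining.
    by_source = defaultdict(list)
    for edge in interactions:
        by_source[edge.source].append(edge)

    paths = []
    for first in interactions:
        if first.source == first.target:
            continue  # assumed constraint: drop self-loops
        for second in by_source.get(first.target, ()):
            if second.source == second.target:
                continue  # assumed constraint: drop self-loops
            if second.target == first.source:
                continue  # assumed constraint: path must not return to its start
            paths.append((first, second))
    return paths


# Placeholder entities, not real biomedical facts.
edges = [
    Interaction("EntityA", "activates", "EntityB"),
    Interaction("EntityB", "inhibits", "EntityC"),
    Interaction("EntityB", "activates", "EntityA"),
    Interaction("EntityC", "activates", "EntityC"),
]
paths = two_hop_paths(edges)
for first, second in paths:
    print(first.source, first.relation, first.target, second.relation, second.target)
```

Indexing by source entity keeps the join linear in the number of candidate pairs, which matters at the scale of millions of extracted interactions.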

Paper Type: Short

Research Area: Resources and Evaluation

Research Area Keywords: corpus creation, benchmarking, language resources, automatic creation and evaluation of language resources, NLP datasets, automatic evaluation of datasets, evaluation

Contribution Types: Data resources

Languages Studied: English

Submission Number: 10702
