BioMaze: Benchmarking and Enhancing Large Language Models for Biological Pathway Reasoning

ACL ARR 2025 May Submission1031 Authors

16 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: Applications of LLMs in various biological domains have been explored recently, but their ability to reason about complex biological pathways remains underexplored, even though such reasoning is crucial for predicting biological phenomena, formulating hypotheses, and designing experiments. This work explores the potential of LLMs in pathway reasoning. We introduce BioMaze, a dataset of 5.1K complex pathway problems derived from real research, covering various biological contexts including natural dynamic changes, disturbances, additional intervention conditions, and multi-scale research targets. Our evaluation of methods such as CoT and graph-augmented reasoning shows that LLMs struggle with pathway reasoning, especially in perturbed systems. To address this, we propose PathSeeker, an LLM agent that enhances reasoning through interactive subgraph-based navigation, enabling a more effective, scientifically aligned approach to handling the complexity of biological systems.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Large Language Model; Reasoning; AI4Science
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources
Languages Studied: English
Keywords: Large Language Model; Reasoning; AI4Science
Submission Number: 1031