Abstract: Applications of LLMs in various biological domains have been explored recently, but their ability to reason about complex biological pathways remains underexplored. This ability is crucial for predicting biological phenomena, formulating hypotheses, and designing experiments. This work explores the potential of LLMs in pathway reasoning. We introduce BioMaze, a dataset of 5.1K complex pathway problems derived from real research, covering various biological contexts including natural dynamic changes, disturbances, additional intervention conditions, and multi-scale research targets. Our evaluation of methods such as chain-of-thought (CoT) and graph-augmented reasoning shows that LLMs struggle with pathway reasoning, especially in perturbed systems. To address this, we propose PathSeeker, an LLM agent that enhances reasoning through interactive subgraph-based navigation, enabling a more effective approach to handling the complexities of biological systems in a scientifically aligned manner.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Large Language Model; Reasoning; AI4Science
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources
Languages Studied: English
Keywords: Large Language Model; Reasoning; AI4Science
Submission Number: 1031