GALAX: Graph-Augmented Language Model for Explainable Reinforcement-Guided Subgraph Reasoning in Precision Medicine

Heming Zhang; Di Huang; Wenyu Li; Michael A Province; Yixin Chen; Philip Payne; Fuhai Li

Published: 26 Jan 2026, Last Modified: 12 Feb 2026. Venue: ICLR 2026 Poster. License: CC BY 4.0

Keywords: Reinforcement Learning, Large Language Model (LLM), Text-Numeric Graph (TNG), Multi-Omics Integration, Explainability

Abstract: In precision medicine, quantitative multi-omic features, topological context, and textual biological knowledge play vital roles in identifying disease-critical signaling pathways and targets, guiding the discovery of novel therapeutics and effective treatment strategies. Existing pipelines capture only one or two of these: numerical omics analyses ignore topological context, text-centric LLMs lack quantitatively grounded reasoning, and graph-only models underuse rich node semantics and the generalization power of LLMs, which limits mechanistic interpretability. Although Process Reward Models (PRMs) aim to guide reasoning in LLMs, they remain limited by coarse step definitions, unreliable intermediate evaluation, and vulnerability to reward hacking, along with added computational cost. These gaps motivate jointly integrating quantitative multi-omic signals, topological structure with node annotations, and literature-scale text via LLMs, using subgraph reasoning as the principal bridge linking numeric evidence, topological knowledge, and language context. To address this challenge, we propose GALAX (Graph Augmented LAnguage model with eXplainability), an innovative framework that integrates pretrained Graph Neural Networks (GNNs) into Large Language Models (LLMs) via reinforcement learning guided by a Graph Process Reward Model (GPRM), which generates disease-relevant subgraphs in a step-wise manner initiated by an LLM and iteratively evaluated by a pretrained GNN and a schema-based rule check, enabling process-level supervision without explicit labels.
As an application, we also introduce Target-QA, a benchmark combining CRISPR-identified targets, multi-omic profiles, and biomedical graph knowledge across diverse cancer cell lines. Target-QA enables GNN pretraining for supervising step-wise graph construction and supports long-context reasoning over text-numeric graphs (TNGs), providing a scalable and biologically grounded framework for explainable, reinforcement-guided subgraph reasoning toward reliable and interpretable target and pathway discovery in precision medicine.
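
To make the step-wise idea in the abstract concrete, here is a minimal toy sketch of growing a subgraph one edge at a time while scoring every step with a learned score plus a schema rule check. It is not the authors' implementation. All names (`schema_ok`, `step_reward`, `grow_subgraph`, `ALLOWED_RELATIONS`), the placeholder nodes `G1`..`G5`, and the score table are hypothetical. A fixed candidate list stands in for the LLM proposer, a lookup table stands in for the pretrained GNN, and greedy selection stands in for the reinforcement-learning policy, with no policy update performed.

```python
from typing import Callable

Edge = tuple[str, str, str]  # (source node, relation, target node)

# Hypothetical schema: the only relation types the rule check accepts.
ALLOWED_RELATIONS = {"activates", "inhibits"}


def schema_ok(edge: Edge, subgraph: list[Edge]) -> bool:
    """Rule check: known relation, not a duplicate, attached to the subgraph."""
    src, rel, dst = edge
    if rel not in ALLOWED_RELATIONS or edge in subgraph:
        return False
    if not subgraph:  # the first edge may start anywhere
        return True
    nodes = {n for s, _, t in subgraph for n in (s, t)}
    return src in nodes or dst in nodes


def step_reward(edge: Edge, subgraph: list[Edge],
                node_score: Callable[[str], float]) -> float:
    """Per-step reward: -1 for a schema violation, else the target node's score."""
    if not schema_ok(edge, subgraph):
        return -1.0
    return node_score(edge[2])


def grow_subgraph(propose: Callable[[list[Edge]], list[Edge]],
                  node_score: Callable[[str], float],
                  max_steps: int = 5,
                  threshold: float = 0.5) -> tuple[list[Edge], list[float]]:
    """Greedily add the best-rewarded candidate edge until none clears the threshold."""
    subgraph: list[Edge] = []
    rewards: list[float] = []
    for _ in range(max_steps):
        scored = [(step_reward(e, subgraph, node_score), e)
                  for e in propose(subgraph)]
        if not scored:
            break
        best_reward, best_edge = max(scored, key=lambda pair: pair[0])
        if best_reward < threshold:
            break
        subgraph.append(best_edge)
        rewards.append(best_reward)  # an RL method would consume these step rewards
    return subgraph, rewards


# Toy stand-ins (all values invented for illustration only).
TOY_EDGES: list[Edge] = [
    ("G1", "activates", "G2"),
    ("G2", "activates", "G3"),
    ("G3", "cites", "G4"),      # relation not in the schema, so it is rejected
    ("G2", "inhibits", "G5"),   # schema-valid but scored below the threshold
]
TOY_SCORES = {"G2": 0.9, "G3": 0.8, "G4": 0.95, "G5": 0.3}

subgraph, rewards = grow_subgraph(
    propose=lambda sg: [e for e in TOY_EDGES if e not in sg],
    node_score=lambda n: TOY_SCORES.get(n, 0.0),
)
```

In this toy run the schema check rejects the `cites` edge even though its target has the highest score, and growth stops once the best remaining candidate falls below the threshold. The point of the sketch is only that each intermediate step receives its own reward without step-level labels.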

Supplementary Material: zip

Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)

Submission Number: 9691
