Biological Pathway Informed Foundation Models with Graph Attention Networks (GATs)

29 Aug 2025 (modified: 11 Oct 2025) | Submitted to NeurIPS 2025 2nd Workshop FM4LS | CC BY 4.0
Keywords: graph attention network, biological pathway informed model, time-series gene expression, gene regulatory networks
TL;DR: A Graph Attention Network (GAT) method for building biological pathway informed foundation models that learn more generalizable and interpretable pathway dynamics and can discover new insights from multi-modal gene data.
Abstract: Biological pathways map the gene–gene interactions that govern human biological processes. Despite their importance, most ML models treat genes as unstructured tokens, discarding known pathway structure. The latest pathway-informed models capture pathway-pathway interactions, but still treat each pathway as a "bag of genes" via MLPs, discarding its topology and gene-gene interactions. We propose a Graph Attention Network (GAT) framework that models pathways at the gene level. Preliminary results show that GATs generalize much better than MLPs, achieving an 81% reduction in MSE when predicting pathway dynamics under unseen treatments. We further validate the correctness of our biological prior by encoding drug mechanisms via edge interventions. Finally, we show that our GAT model correctly rediscovers all five gene-gene interactions in the canonical TP53-MDM2-MDM4 feedback loop from raw mRNA expression data, demonstrating its potential to generate novel biological hypotheses. This work positions GATs as pathway-informed foundation models that can generate richer pathway embeddings from multi-modal gene data. [All code will be released upon publication.]
Submission Number: 10
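To make the abstract's two central ideas concrete (attention restricted to pathway edges, and a drug encoded as an edge intervention), here is a minimal NumPy sketch of a single-head graph attention layer. It is not the authors' implementation, which has not been released: the function name `gat_layer`, the toy edge list, the feature sizes, and the "hypothetical drug" are all assumptions chosen for illustration, and the three toy edges are not the five interactions reported in the abstract.

```python
import numpy as np

def gat_layer(h, adj, W, a, slope=0.2):
    """Single-head graph attention layer in dense form (illustrative sketch).

    h:   (N, F)  per-gene input features
    adj: (N, N)  adj[i, j] = 1 if gene j regulates gene i (i attends to j)
    W:   (F, Fp) shared linear projection
    a:   (2*Fp,) attention vector
    Returns the updated features (N, Fp) and attention weights (N, N).
    """
    n = h.shape[0]
    z = h @ W
    fp = z.shape[1]
    # e[i, j] = LeakyReLU(a^T [z_i || z_j]), split into two dot products
    e = (z @ a[:fp])[:, None] + (z @ a[fp:])[None, :]
    e = np.where(e > 0, e, slope * e)
    # Attend only along pathway edges, plus a self-loop for every gene
    mask = adj.astype(bool) | np.eye(n, dtype=bool)
    e = np.where(mask, e, -np.inf)
    # Row-wise softmax; masked entries become exactly zero
    e = e - e.max(axis=1, keepdims=True)
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z, alpha

genes = ["TP53", "MDM2", "MDM4"]
idx = {g: i for i, g in enumerate(genes)}

# Toy directed edges (regulator, target); assumed for illustration only
edges = [("TP53", "MDM2"), ("MDM2", "TP53"), ("MDM4", "TP53")]
adj = np.zeros((3, 3))
for src, dst in edges:
    adj[idx[dst], idx[src]] = 1.0

rng = np.random.default_rng(0)
h = rng.normal(size=(3, 4))   # placeholder expression features
W = rng.normal(size=(4, 2))   # untrained random weights
a = rng.normal(size=4)

out, alpha = gat_layer(h, adj, W, a)

# Edge intervention: a hypothetical drug that blocks MDM2 -> TP53
adj_drug = adj.copy()
adj_drug[idx["TP53"], idx["MDM2"]] = 0.0
out_drug, alpha_drug = gat_layer(h, adj_drug, W, a)
```

After the intervention, TP53 places zero attention on MDM2 and its remaining weight is redistributed over MDM4 and its own self-loop, while the weights and features are untouched. That is the sense in which a drug mechanism can be expressed purely as a change to the graph rather than as an extra input feature.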