PathSymetic: Neuro-Symbolic Causal Hypothesis Generation for Mechanistic Pathway Discovery in Genomic Systems

Published: 18 Nov 2025, Last Modified: 18 Nov 2025 | SPARTA_AAAI2026 Poster | Everyone | CC BY 4.0
Keywords: Neuro-symbolic causal inference; Gene-to-pathway causal mapping; Ontology-guided reasoning; Biological knowledge graphs; Pathway-level hypothesis generation; Interventional transcriptomics (CRISPR, perturbations)
TL;DR: PathSymetic is a neuro-symbolic framework that integrates LLMs, biological ontologies, and causal learning to infer interpretable, pathway-level mechanisms from high-dimensional transcriptomic data.
Abstract: Identifying causal relationships between gene-level signals and biological pathways remains a core challenge in functional genomics, particularly when transcriptomic data are high-dimensional and noisy. PathSymetic is a neuro-symbolic framework that integrates large language models (LLMs), ontology-grounded knowledge graphs, and causal structure learning to infer interpretable pathway-level hypotheses. It combines symbolic reasoning and neural representations in three stages: (1) ontology-guided symbolic grounding, where pathway and reaction metadata are structured into logical graphs; (2) causal representation alignment, where interventional transcriptomic data (e.g., CRISPR, small-molecule perturbations) are used to learn causal attributions via counterfactual probing; and (3) concept-level hypothesis generation, where symbolic rules are merged with LLM-derived latent embeddings to yield ranked mechanistic pathway hypotheses. Fine-tuned on benchmark datasets from cancer and metabolic diseases, PathSymetic achieves an AUPR of 0.81, an F1-score of 0.77, and a Precision@10 of 0.94, outperforming attention-based GNNs (AUPR 0.68) and pathway enrichment baselines (F1 0.52). It further achieves a Hit@10 of 98.4% and a Hit@20 of 99.6%, indicating that it ranks experimentally validated pathways among its top candidates. It also uncovers biologically plausible novel hypotheses, supported by co-citation analysis and mechanistic literature.
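The abstract reports the ranking metrics Precision@10, Hit@10, and Hit@20 without defining them. The sketch below shows how such metrics are commonly computed for a ranked list of pathway hypotheses. It assumes the usual definitions (Precision@k as the fraction of the top k that are validated, Hit@k as whether at least one validated pathway appears in the top k, averaged over queries); the paper's exact evaluation protocol may differ. The function names and the pathway identifiers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], validated: Set[str], k: int) -> float:
    """Fraction of the top-k ranked pathways that are experimentally validated."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Divide by k (not by the list length) so short lists are not rewarded.
    return sum(1 for pathway in ranked[:k] if pathway in validated) / k


def hit_at_k(ranked: Sequence[str], validated: Set[str], k: int) -> bool:
    """True if at least one validated pathway appears in the top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    return any(pathway in validated for pathway in ranked[:k])


def mean_hit_at_k(queries: List[Tuple[Sequence[str], Set[str]]], k: int) -> float:
    """Share of queries (e.g. perturbation experiments) with a hit in the top k."""
    if not queries:
        raise ValueError("queries must be non-empty")
    return sum(hit_at_k(ranked, validated, k) for ranked, validated in queries) / len(queries)


# Hypothetical ranked hypotheses for one query, best first (illustrative names only).
ranked_example = ["glycolysis", "p53_signaling", "mtor_signaling", "apoptosis", "wnt_signaling"]
validated_example = {"p53_signaling", "apoptosis"}

p_at_5 = precision_at_k(ranked_example, validated_example, 5)  # 2 of top 5 validated
hit_1 = hit_at_k(ranked_example, validated_example, 1)         # top-1 is not validated
hit_2 = hit_at_k(ranked_example, validated_example, 2)         # second entry is validated
```

Under these definitions, a Hit@10 of 98.4% would mean that for 98.4% of queries at least one validated pathway appears among the ten highest-ranked hypotheses.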
Submission Number: 16