PathSymetic: Neuro-Symbolic Causal Hypothesis Generation for Mechanistic Pathway Discovery in Genomic Systems
Keywords: Neuro-symbolic causal inference; Gene-to-pathway causal mapping; Ontology-guided reasoning; Biological knowledge graphs; Pathway-level hypothesis generation; Interventional transcriptomics (CRISPR, perturbations)
TL;DR: PathSymetic is a neuro-symbolic framework that integrates LLMs, biological ontologies, and causal learning to infer interpretable, pathway-level mechanisms from high-dimensional transcriptomic data.
Abstract: Identifying causal relationships between gene-level signals
and biological pathways remains a core challenge in functional genomics, particularly under high-dimensional and
noisy transcriptomic data. PathSymetic is a neuro-symbolic
framework that integrates large language models (LLMs),
ontology-grounded knowledge graphs, and causal structure
learning to infer interpretable pathway-level hypotheses. It
combines symbolic reasoning and neural representations in
three stages: (1) Ontology-guided symbolic grounding, where
pathway and reaction metadata are structured into logical
graphs; (2) Causal representation alignment, where interventional transcriptomic data (e.g., CRISPR, small-molecule perturbations) are used to learn causal attributions via counterfactual probing; and (3) Concept-level hypothesis generation, where symbolic rules are merged with LLM-derived
latent embeddings to yield ranked mechanistic pathway hypotheses. Fine-tuned on benchmark datasets from cancer and
metabolic diseases, PathSymetic achieved an AUPR of 0.81,
F1-score of 0.77, and Precision@10 of 0.94, outperforming
attention-based GNNs (AUPR 0.68) and pathway enrichment
baselines (F1 0.52). It further achieves Hit@10 of 98.4% and Hit@20 of 99.6%, showing that it
ranks experimentally validated pathways among the top candidates. It also
uncovers biologically plausible novel hypotheses, supported by co-citation analysis and mechanistic
literature.
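The abstract reports Precision@10 and Hit@k but does not spell out how they are computed. The sketch below uses the common definitions of these ranking metrics, which may differ from the authors' exact evaluation protocol. The function names and the toy pathway lists are hypothetical and serve only as illustration.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], validated: Set[str], k: int) -> float:
    """Fraction of the top-k ranked pathways that are experimentally validated."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(p in validated for p in ranked[:k]) / k


def hit_at_k(ranked: Sequence[str], validated: Set[str], k: int) -> bool:
    """True if at least one validated pathway appears in the top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    return any(p in validated for p in ranked[:k])


def mean_hit_at_k(rankings: Sequence[Sequence[str]],
                  validated_sets: Sequence[Set[str]],
                  k: int) -> float:
    """Hit@k averaged over queries (e.g., one ranking per perturbation)."""
    hits = [hit_at_k(r, v, k) for r, v in zip(rankings, validated_sets)]
    return sum(hits) / len(hits)


# Toy example with made-up rankings (assumption: not data from the paper).
ranking_a = ["p53 signaling", "glycolysis", "mTOR signaling", "apoptosis"]
validated_a = {"p53 signaling", "apoptosis"}
ranking_b = ["TCA cycle", "glycolysis", "apoptosis"]
validated_b = {"apoptosis"}

p_at_2 = precision_at_k(ranking_a, validated_a, 2)
mean_hit_2 = mean_hit_at_k([ranking_a, ranking_b], [validated_a, validated_b], 2)
```

Under these definitions, a Hit@10 of 98.4% would mean that for 98.4% of evaluated queries at least one validated pathway appears among the ten highest-ranked hypotheses.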
Submission Number: 16