Leveraging Disease-Specific Topologies and Counterfactual Relationships in Knowledge Graphs for Inductive Reasoning in Drug Repurposing

Cerag Oguztuzun; Zhenxiang Gao; Hui Li; Rong Xu

Published: 13 Oct 2024, Last Modified: 01 Dec 2024 | AIDrugX Spotlight | CC BY 4.0

Keywords: drug repurposing, graph machine learning, knowledge graph, graph augmentation, inductive reasoning, foundation models, Alzheimer's disease

TL;DR: We present a domain-specific knowledge graph augmentation method leveraging counterfactual relationships for semi-inductive reasoning, enhancing the performance of drug repurposing with improved generalizability and novel drug candidate discovery.

Abstract: Drug repurposing offers a cost-effective strategy to accelerate drug development by identifying new therapeutic uses for approved medications. However, it poses significant challenges for complex diseases with poorly understood mechanisms of action. Addressing these diseases requires the efficient integration of new data while minimizing retraining time, prompting us to develop domain-specific graph augmentation techniques that support semi-inductive reasoning. We discovered that leveraging counterfactual relationships derived from disease-specific topological structures significantly enhances model performance. Based on this insight, we integrated counterfactual relationships as an augmentation method and an initialization step in our knowledge graph (KG) link prediction training process. We introduce KGïA, an inductive KG augmentation method that utilizes counterfactual relationships based on disease-specific topologies. By aligning augmentation with the intrinsic topological features of disease entities, KGïA enhances the KG in a domain-specific manner, facilitating the discovery of a broader range of novel drug candidates tailored to specific diseases. Our biomedical KG comprises 1,614,801 triples and 100,563 biomedical entities, including 30,006 diseases, constructed from 6 biomedical datasets and enriched through Natural Language Processing (NLP) relation extraction. Extensive experiments on this comprehensive KG using 5 augmented architectures demonstrate that semi-inductive reasoning significantly improves generalizability (up to a 24× increase in Mean Reciprocal Rank (MRR)) and that augmented models outperform state-of-the-art KG-based drug repurposing methods (up to a 32% improvement in MRR). Additionally, in an Alzheimer's Disease (AD) case study, our model identified up to 5 mechanism categories compared to 2 in the baseline, highlighting its enhanced capability to uncover diverse drug candidates.
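
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how Mean Reciprocal Rank is conventionally computed for link prediction: each test triple contributes the reciprocal of the 1-based rank at which the model places the true entity, and MRR is the mean of those values. The function name and the example ranks are hypothetical illustrations, not the authors' evaluation code or data.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Return the MRR for a list of 1-based ranks of the true entity.

    `ranks` is a hypothetical input: one rank per test triple, where 1
    means the correct entity was scored highest among all candidates.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    # Average the reciprocal ranks over all test triples.
    return sum(1.0 / r for r in ranks) / len(ranks)


# Illustrative ranks only (assumed, not taken from the paper).
example_ranks = [1, 2, 4]
example_mrr = mean_reciprocal_rank(example_ranks)  # (1 + 1/2 + 1/4) / 3
```

Because MRR is bounded above by 1 and dominated by top-ranked hits, a large multiplicative gain such as the one reported in the abstract typically indicates that the baseline rarely ranked the correct entity near the top.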

Submission Number: 75
