Efficient Knowledge Graph Embedding Framework to Alleviate Data Sparsity for Polypharmacy Side Effects Prediction

Senbo Tu, Zhihao Yang, Lei Wang, Wei Liu, Yin Zhang, Ling Luo, Bo Xu, Jian Wang, Yumeng Yang, Zhehuan Zhao, Hongfei Lin

Published: 01 Jan 2024, Last Modified: 28 Jul 2025, Venue: BIBM 2024, License: CC BY-SA 4.0

Abstract: Polypharmacy, the combined use of multiple drugs to treat disease, often carries a higher risk of side effects. Acquiring rich and comprehensive information about the side effects of multi-drug therapy is therefore a crucial task in the medical industry. However, the data available for many side effects are sparse, so their features cannot be adequately learned, which results in poor side effect prediction performance. In this paper, we propose LTESampleKGE, a framework that improves knowledge graph embedding (KGE) models by using LTE operations and subsampling methods. LTESampleKGE consists of two main modules: an entity embedding enhancement module and a KGE subsampling module. The former enhances entity embeddings by applying a linear transformation to entity representations instead of using a graph convolutional network (GCN) structure, while the latter applies subsampling methods to the KGE negative sampling (NS) loss so that training pays more attention to sparse data. LTESampleKGE can thus effectively alleviate the data sparsity problem in the polypharmacy side effects prediction task. Experimental evaluations show that our method outperforms the baseline models. For example, LTESampleKGE outperforms MSTE by 1.20% in PR-AUC on the TWOSIDES dataset and by 0.46% in AP@n on the Drugbank dataset.
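
The abstract gives no formulas, so the following is only a minimal sketch of how the two modules might fit together, not the authors' implementation. It assumes a TransE-style distance score, a single shared linear map W applied to entity embeddings, and inverse-square-root relation-frequency weights on the NS loss; the toy triples and every name in the code (transform, score, subsampling_weights, weighted_ns_loss) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy (head drug, side-effect relation, tail drug) triples; values are made up.
triples = np.array([
    [0, 0, 1],
    [0, 0, 2],
    [1, 0, 2],
    [2, 1, 3],  # relation 1 occurs once, so it plays the "sparse" side effect
])
num_entities, num_relations, dim = 4, 2, 8

E = rng.normal(size=(num_entities, dim))   # entity embeddings
R = rng.normal(size=(num_relations, dim))  # relation embeddings
W = rng.normal(size=(dim, dim)) * 0.1      # shared linear map (assumed form)


def transform(entity_ids):
    """Apply a linear transformation to entity embeddings (no GCN layer)."""
    return E[entity_ids] @ W


def score(h, r, t):
    """TransE-style score (an assumption): higher means more plausible."""
    return -np.linalg.norm(transform(h) + R[r] - transform(t), axis=-1)


def subsampling_weights(triples):
    """Weight each triple by 1/sqrt(relation frequency), normalised to sum to 1."""
    _, inverse, counts = np.unique(
        triples[:, 1], return_inverse=True, return_counts=True
    )
    raw = 1.0 / np.sqrt(counts[inverse])
    return raw / raw.sum()


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return -np.logaddexp(0.0, -x)


def weighted_ns_loss(triples, num_negatives=2, margin=1.0):
    """Negative sampling loss in which rare relations receive larger weights."""
    h, r, t = triples.T
    w = subsampling_weights(triples)
    pos = log_sigmoid(margin + score(h, r, t))
    # Corrupt tails uniformly at random to build negative triples.
    neg_t = rng.integers(0, num_entities, size=(len(triples), num_negatives))
    neg_scores = score(h[:, None], r[:, None], neg_t)
    neg = log_sigmoid(-(margin + neg_scores)).mean(axis=1)
    return float(-(w * (pos + neg)).sum())


weights = subsampling_weights(triples)
loss = weighted_ns_loss(triples)
```

In this sketch the triple with the rare relation receives a larger weight than each triple with the frequent relation, which is one simple way a loss can "pay more attention to sparse data"; the weighting scheme actually used in the paper may differ.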