Knowledge-guided Protein Complex Identification with Fuzzy-based Graph Representation Learning

Published: 01 Jan 2024, Last Modified: 08 Feb 2025, Venue: BIBM 2024, License: CC BY-SA 4.0
Abstract: Protein complexes are essential in regulating various cellular processes. Many computational algorithms have been developed to identify protein complexes from protein-protein interaction (PPI) networks, but they are limited in their ability to leverage diverse biological knowledge about proteins. Additionally, while deep learning-based algorithms perform well in identifying protein complexes, they fail to explicitly capture the dependency between protein embeddings and the resulting complexes. To address these challenges, this paper proposes a knowledge-guided protein complex identification algorithm with fuzzy-based graph representation learning, named KPCI-FGRL. In particular, KPCI-FGRL develops a fuzzy-based graph representation learning framework to manipulate and fuse network structure with multi-view biological knowledge of proteins. During the training phase, besides employing a self-supervised loss to improve the cohesion of the complexes, KPCI-FGRL also incorporates an expectation about protein complexes based on the fuzzy clustering concept, so that the dependency between protein embeddings and complexes is explicitly coupled. Furthermore, KPCI-FGRL can identify overlapping protein complexes through a heuristic search strategy over the fuzzy memberships of proteins. Extensive experimental results on four different PPI networks collected from two species demonstrate that KPCI-FGRL significantly outperforms several state-of-the-art protein complex identification algorithms.
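To make the idea of overlapping complexes from fuzzy memberships concrete, the following is a minimal sketch and not the KPCI-FGRL implementation. It assumes protein embeddings and cluster centers are already available, computes memberships with the standard fuzzy c-means formula, and replaces the paper's heuristic search with a plain membership threshold. The function names, the toy data, and the threshold value are all hypothetical.

```python
import numpy as np

def fuzzy_memberships(embeddings, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means memberships (an assumption, not the paper's exact form).

    u[i, k] is proportional to dist(i, k) ** (-2 / (m - 1)); each row sums to 1.
    """
    # Pairwise Euclidean distances between every embedding and every center.
    dists = np.linalg.norm(embeddings[:, None, :] - centers[None, :, :], axis=2)
    # eps avoids division by zero when an embedding coincides with a center.
    inv = (dists + eps) ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def overlapping_clusters(memberships, threshold=0.3):
    """Put each protein into every cluster whose membership reaches the threshold.

    This simple thresholding stands in for the heuristic search mentioned in
    the abstract, whose details are not given there.
    """
    return [
        set(np.flatnonzero(memberships[:, k] >= threshold).tolist())
        for k in range(memberships.shape[1])
    ]

# Toy 2-D "embeddings" for five hypothetical proteins and two cluster centers.
embeddings = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [0.9, 0.0], [0.5, 0.0]])
centers = np.array([[0.0, 0.0], [1.0, 0.0]])

u = fuzzy_memberships(embeddings, centers)
clusters = overlapping_clusters(u, threshold=0.3)
print(clusters)
```

In this toy setting the protein at index 4 lies midway between the two centers, so its membership is split evenly and it is assigned to both clusters, which is the kind of overlap a hard clustering cannot express.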