Joint learning of logic relationships for studying protein function using phylogenetic profiles and the Rosetta Stone method

Published: 2006, Last Modified: 13 Nov 2024. IEEE Trans. Signal Process. 2006. CC BY-SA 4.0
Abstract: Identifying logic relationships between proteins is essential for understanding their function within cells. Previous studies have inferred protein logic relationships using pairwise and triplet logic analysis of phylogenetic profiles. Other computational methods use pairwise analysis of Rosetta Stone data to infer protein functional linkages. (Proteins that share the same metabolic pathway or a common structural complex are said to be functionally linked.) This paper describes a Bayesian modeling framework that combines phylogenetic profile data, via a likelihood, with Rosetta Stone data, via a prior probability. Based on the proposed framework, a general method is developed for jointly learning high-order logic relationships among proteins whose presence or absence can be identified by logic functions. The method is applied to analyze protein triplets and quartets on phylogenetic profile and Rosetta Stone data sets with 140 clusters of orthologous genes (COGs). The biological meaning of the top 30 significant triplets is further verified using the KEGG and NCBI databases. Over 50% of the discovered relationships that are associated with high significance scores could not be inferred using phylogenetic profile or Rosetta Stone data alone. The statistical analysis in this paper shows that all significant quartets have p-values ≤ 5.71E-04. Many of them assign putative functional roles to uncharacterized proteins.
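The idea of combining a likelihood over phylogenetic profiles with a prior over candidate relationships can be illustrated with a toy calculation. The sketch below is not the paper's model: the noise parameter, the three candidate logic functions, the profiles, and the prior values are all assumptions made up for illustration. It scores the hypothesis that protein C's presence across genomes follows a logic function of proteins A and B, then normalizes likelihood times prior into a posterior.

```python
import math

# Candidate two-input logic functions (illustrative subset, an assumption).
LOGIC_FUNCTIONS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def log_likelihood(profile_a, profile_b, profile_c, func, noise=0.1):
    """Log-likelihood of profile C under C = func(A, B).

    Assumes each genome's observation is flipped independently with
    probability `noise` (a hypothetical noise model).
    """
    total = 0.0
    for a, b, c in zip(profile_a, profile_b, profile_c):
        total += math.log(1.0 - noise) if func(a, b) == c else math.log(noise)
    return total


def posterior(profile_a, profile_b, profile_c, prior, noise=0.1):
    """Normalized posterior over the candidate logic functions."""
    log_post = {
        name: log_likelihood(profile_a, profile_b, profile_c, func, noise)
        + math.log(prior[name])
        for name, func in LOGIC_FUNCTIONS.items()
    }
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post.values())
    weights = {name: math.exp(value - peak) for name, value in log_post.items()}
    norm = sum(weights.values())
    return {name: weight / norm for name, weight in weights.items()}


# Made-up presence (1) / absence (0) profiles across 8 genomes.
A = [1, 1, 0, 0, 1, 0, 1, 0]
B = [1, 0, 1, 0, 1, 1, 0, 0]
C = [1, 0, 0, 0, 1, 0, 0, 0]  # matches A AND B in every genome

# Hypothetical prior, standing in for external evidence such as
# Rosetta Stone linkage; the values are invented.
PRIOR = {"AND": 0.5, "OR": 0.25, "XOR": 0.25}

result = posterior(A, B, C, PRIOR)
best = max(result, key=result.get)
print(best, result)
```

Working in log space keeps the product over genomes from underflowing when profiles span many genomes. How the prior is actually derived from Rosetta Stone data, and how significance is assessed, are not specified in the abstract and are not modeled here.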