A proteome-scale masked language model for fast protein-protein interaction prediction

Published: 04 Mar 2024, Last Modified: 29 Apr 2024, GEM Poster, CC BY 4.0
Track: Machine learning: computational method and/or computational results
Cell: I do not want my work to be considered for Cell Systems
Keywords: PPIs, Language Models, Proteins
TL;DR: ProteomeLM, a proteome-scale language model, efficiently identifies protein-protein interactions in Escherichia coli by ranking all potential pairs in a single pass, serving as an effective pre-screening tool for more detailed analyses.
Abstract: Protein-protein interactions (PPIs) play a central role in most biological processes. The large number of potential pairs of interacting proteins makes the determination of PPI networks challenging. High-throughput experimental methods for determining these networks remain prohibitive beyond a few model species. Hence, computational methods are needed to screen these interactions. While some methods score each pair individually, we introduce ProteomeLM, a proteome-scale language model that can rank all pairs in a single pass. Early results suggest that at least 70% of PPIs in Escherichia coli could be identified among the top 5% of ranked pairs. This method should be useful as a pre-screening tool, identifying the most promising pairs for docking-based or experimental determination of PPIs.
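The headline figure in the abstract (the share of known PPIs recovered among the top 5% of ranked pairs) is a recall-at-top-fraction measure. The following is a minimal sketch of how such a number could be computed from any table of pair scores. It is not ProteomeLM's code: the function name `recall_in_top_fraction`, the dictionary layout, and the toy scores and labels are all hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def recall_in_top_fraction(scores, known_pairs, fraction=0.05):
    """Share of known interacting pairs found among the top-ranked pairs.

    scores      -- dict mapping frozenset({a, b}) to a float, where a higher
                   score means the pair is ranked as more likely to interact
                   (hypothetical layout, not taken from the paper)
    known_pairs -- iterable of (a, b) tuples treated as true interactions
    fraction    -- portion of the ranked list to keep, e.g. 0.05 for top 5%
    """
    known = {frozenset(pair) for pair in known_pairs}
    if not known:
        raise ValueError("known_pairs must not be empty")

    # Rank every candidate pair once, from highest to lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)

    # Keep at least one pair so that tiny toy inputs still work.
    cutoff = max(1, math.ceil(fraction * len(ranked)))
    top = set(ranked[:cutoff])

    return len(known & top) / len(known)


# Toy example: 5 proteins give 10 unordered pairs (made-up scores).
pairs = list(combinations("ABCDE", 2))
toy_values = [0.9, 0.2, 0.1, 0.3, 0.4, 0.15, 0.25, 0.8, 0.05, 0.6]
toy_scores = {frozenset(p): s for p, s in zip(pairs, toy_values)}
toy_known = [("A", "B"), ("C", "D"), ("D", "E"), ("A", "E")]

# Top 20% of 10 pairs keeps the 2 best-scored pairs: A-B and C-D.
print(recall_in_top_fraction(toy_scores, toy_known, fraction=0.2))
```

In this toy case two of the four labelled pairs fall in the retained top 20%, so the function returns 0.5. On a real proteome the same calculation would be run with `fraction=0.05` over all candidate pairs against a reference set of interactions.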
Submission Number: 58