BindingGYM: A Large-Scale Mutational Dataset Toward Deciphering Protein-Protein Interactions

Published: 13 Oct 2024, Last Modified: 01 Dec 2024, AIDrugX Poster, CC BY 4.0
Keywords: Mutation Effect Prediction, Protein Structure, Protein-Protein Interaction, Benchmarks
TL;DR: We have curated BindingGYM, the largest database for structure-based prediction of mutational effects on protein-protein interactions.
Abstract: Protein-protein interactions are crucial for drug discovery and for understanding biological mechanisms. Despite significant advances in predicting the structures of protein complexes, led by AlphaFold3, accurately determining the strength of these interactions remains a challenge. Traditional low-throughput experimental methods do not generate sufficient data for comprehensive benchmarking or for training deep learning models. Deep mutational scanning (DMS) experiments provide rich, high-throughput data; however, they are often used incompletely: the binding partners are neglected, and models are fine-tuned on a per-study basis without assessing how well they generalize across different assays. To address these limitations, we collected over ten million raw DMS data points and refined them to half a million high-quality points from twenty-five assays, focusing on protein-protein interactions. We intentionally excluded non-PPI DMS data pertaining to intrinsic protein properties, such as fluorescence or catalytic activity. Because interactions inherently involve at least two proteins, our dataset uses a comprehensive pipeline to pair binding energies with the sequences and structures of all interacting partners. This curated dataset serves as a foundation for benchmarking and training the next generation of deep learning models focused on protein-protein interactions, opening the door to high-impact applications including understanding cellular networks and advancing drug target discovery and development.
Submission Number: 32