Rotamer Density Estimators are Unsupervised Learners of the Effect of Mutations on Protein-Protein Interaction


22 Sept 2022, 12:30 (modified: 13 Nov 2022, 09:34) | ICLR 2023 Conference Blind Submission | Readers: Everyone
Keywords: effect of mutations, protein-protein interaction, unsupervised learning
Abstract: Protein-protein interactions play a fundamental role in a broad range of biological processes, and predicting the effect of amino acid mutations on binding is crucial to protein engineering. Traditional biophysical and statistical methods have dominated the area for years, but they depend heavily on expert priors and face a trade-off between efficiency and accuracy. Recent successes in deep learning for proteins have made data-driven approaches more appealing than ever. The major challenge, however, is the scarcity of experimental mutational data annotated with changes in binding affinity. In this work, we demonstrate that mutational effects on binding can be predicted from the change in conformational flexibility of the protein-protein interface. We propose a flow-based generative model, the Rotamer Density Estimator (RDE), that estimates the probability distribution of conformations, and we use the entropy of this distribution as the measure of flexibility. The model is trained solely on protein structures and requires no supervision from experimentally measured changes in binding affinity. Further, the unsupervised representations extracted by the model can be fed to simple downstream neural networks to obtain even more accurate predictions. The proposed method outperforms empirical energy functions and other machine learning-based approaches.
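To make the entropy-as-flexibility idea in the abstract concrete, the sketch below estimates the entropy of a density over a single torsion angle by Monte Carlo, H[p] = -E_p[log p(chi)]. It is an illustration only, not the authors' model: a von Mises density stands in for the trained flow, and every name in it (`make_von_mises`, `monte_carlo_entropy`, the `kappa` values) is a hypothetical choice made for this example.

```python
import math
import random


def log_bessel_i0(kappa, terms=80):
    """Log of the modified Bessel function I0, via its power series."""
    total, term = 1.0, 1.0
    for m in range(1, terms):
        term *= (kappa / 2.0) ** 2 / (m * m)
        total += term
    return math.log(total)


def make_von_mises(mu, kappa):
    """Return (sample, log_prob) for a von Mises density on one torsion angle.

    Assumption: this simple circular density replaces the flow-based
    estimator, which would expose the same two operations.
    """
    log_norm = math.log(2.0 * math.pi) + log_bessel_i0(kappa)

    def sample(rng):
        return rng.vonmisesvariate(mu, kappa)

    def log_prob(chi):
        return kappa * math.cos(chi - mu) - log_norm

    return sample, log_prob


def monte_carlo_entropy(sample, log_prob, n_samples=20000, seed=0):
    """Estimate H[p] = -E_p[log p(chi)] by averaging over samples from p."""
    rng = random.Random(seed)
    return -sum(log_prob(sample(rng)) for _ in range(n_samples)) / n_samples


# Hypothetical toy case: a loosely constrained side chain vs. a tightly packed one.
h_flexible = monte_carlo_entropy(*make_von_mises(mu=1.0, kappa=2.0))
h_rigid = monte_carlo_entropy(*make_von_mises(mu=1.0, kappa=8.0))
delta_entropy = h_rigid - h_flexible  # negative when flexibility is lost
print(f"H(flexible)={h_flexible:.3f}  H(rigid)={h_rigid:.3f}  dH={delta_entropy:.3f}")
```

A more concentrated density yields a lower entropy, so the sign of `delta_entropy` reflects whether the toy side chain became more or less flexible. How entropy changes are mapped to changes in binding affinity is not specified in the abstract and is not modeled here.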
Please Choose The Closest Area That Your Submission Falls Into: Machine Learning for Sciences (e.g., biology, physics, health sciences, social sciences, climate/sustainability)