Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions

Xiaoran Jiao; Weian Mao; Wengong Jin; Peiyuan Yang; Hao Chen; Chunhua Shen

Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions

Xiaoran Jiao, Weian Mao, Wengong Jin, Peiyuan Yang, Hao Chen, Chunhua Shen

Published: 22 Jan 2025, Last Modified: 05 Mar 2025ICLR 2025 SpotlightEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Mutational Effects; Protein-Protein Interactions

Abstract: Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta\Delta G$ data, existing methods focus on pre-training, while neglecting the importance of alignment. In this work, we propose Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to prediction of $\Delta\Delta G$. We begin by analyzing the thermodynamic definition of $\Delta\Delta G$ and introducing the Boltzmann distribution to connect energy to the protein conformational distribution. However, the protein conformational distribution is intractable. Therefore, we employ Bayes’ theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for the estimation of $\Delta\Delta G$. Compared to previous methods based on inverse folding, our method explicitly accounts for the unbound state of the protein complex in the $\Delta \Delta G$ thermodynamic cycle, introducing a physical inductive bias and achieving supervised and unsupervised state-of-the-art (SoTA) performance. Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised) on SKEMPI v2, significantly surpassing the previously reported %SoTA values SoTA results of 0.2632 and 0.4324, respectively. Furthermore, we demonstrate the capability of our method in binding energy prediction, protein-protein docking, and antibody optimization tasks. Code is available at [https://github.com/aim-uofa/BA-DDG](https://github.com/aim-uofa/BA-DDG)

Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 4282

Loading