Internal-Coordinate Density Modelling of Protein Structure: Covariance Matters

21 Sept 2023 (modified: 11 Feb 2024)Submitted to ICLR 2024EveryoneRevisionsBibTeX
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Machine Learning, Structural Biology, Internal-Coordinate Density Modelling, Generative Modelling
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: After the recent ground-breaking advances in protein structure prediction, one of the remaining challenges in protein machine learning is to reliably predict distributions of structural states. Parametric models of fluctuations are difficult to fit due to complex covariance structures between degrees of freedom in the protein chain, often causing models to either violate local or global structural constraints. In this paper, we present a new strategy for modelling protein densities in internal coordinates, which uses constraints in 3D space to induce covariance structure between the internal degrees of freedom. We illustrate the potential of the procedure by constructing a variational autoencoder with full covariance output induced by the constraints implied by the conditional mean in 3D, and demonstrate that our approach makes it possible to scale density models of internal coordinates to full protein backbones in two settings: 1) a unimodal setting for proteins exhibiting small fluctuations and limited amounts of available data, and 2) a multimodal setting for larger conformational changes in a high data regime.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3708
Loading