Graph representation learning for protein conformation sampling

Published: 19 Oct 2022, Last Modified: 17 Feb 2026, OpenReview Archive Direct Upload, Everyone, CC BY 4.0
Abstract: Significant research on deep neural networks, culminating in AlphaFold2, convincingly shows that deep learning can predict the native structure of a given protein sequence with high accuracy. In contrast, work on deep learning frameworks that can account for the structural plasticity of protein molecules remains in its infancy. Many researchers are now investigating deep generative models to explore the structure space of a protein. Current models largely use 2D convolution, leveraging representations of protein structures as contact maps or distance matrices. The goal is exclusively to generate protein-like, sequence-agnostic tertiary structures, but no rigorous metrics are used to make this case convincingly. This paper makes several contributions. It builds on momentum in graph representation learning and formalizes a protein tertiary structure as a contact graph. It demonstrates that graph representation learning outperforms models based on image convolution. It also equips graph-based deep latent variable models with the ability to learn from experimentally available tertiary structures of proteins of varying lengths. The resulting models are shown to outperform state-of-the-art ones on rigorous metrics that quantify both local and distal patterns in physically realistic protein structures. We hope this work will spur further research in deep generative models for obtaining a broader view of the structure space of a protein molecule.
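To make the contact-graph representation mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes one C-alpha coordinate per residue and an 8 Å contact cutoff, a common convention that the abstract does not specify. The function name `contact_graph` and its parameters are hypothetical.

```python
import math

def contact_graph(coords, threshold=8.0, min_separation=1):
    """Build a distance matrix and a contact edge list from 3D coordinates.

    coords: list of (x, y, z) tuples, one per residue (assumed C-alpha atoms).
    threshold: contact cutoff in Angstroms (8.0 is an assumed convention).
    min_separation: minimum sequence separation |i - j| for an edge.
    """
    n = len(coords)
    # Pairwise Euclidean distances: the "distance matrix" view of a structure.
    dist = [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]
    # Undirected edges (i < j) between residues closer than the cutoff:
    # the "contact graph" view of the same structure.
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if dist[i][j] <= threshold
    ]
    return dist, edges

# Toy example: four residues on a straight line, 3.8 Angstroms apart.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
dist, edges = contact_graph(toy_coords)
print(edges)
```

Because the edge list grows with the number of residues rather than being fixed to an image size, a graph of this kind can describe proteins of varying lengths, which is the property the abstract highlights.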