Abstract: Proteins are dynamic macromolecules whose functions are intricately linked to their structural flexibility. Recent breakthroughs in deep learning have enabled the accurate prediction of static protein structures. Understanding protein function, however, often requires access to a diverse ensemble of conformations. Traditional sampling techniques, such as molecular dynamics and Monte Carlo simulations, can explore conformational landscapes, but they are often limited by high computational cost and slow convergence. In response, deep generative models (DGMs) have emerged as a powerful alternative for efficient and scalable protein conformation sampling. Leveraging architectures such as variational autoencoders, normalizing flows, generative adversarial networks, and diffusion models, DGMs can learn complex, high-dimensional distributions over protein conformations directly from data. This survey provides a comprehensive overview of recent advances in generative models for protein conformation sampling. We categorize existing models by generative architecture, structural representation, and target task. We also discuss key datasets, evaluation metrics, limitations, and opportunities for integrating physics-based knowledge with data-driven models. By bridging machine learning and structural biology, DGMs are poised to transform our ability to model, design, and understand dynamic protein behavior.