Keywords: Protein Conformation, Generative Modeling, Molecular Dynamics, Diffusion Models, Language Models
TL;DR: An autoregressive model that simultaneously learns the conformations and dynamics of proteins from molecular dynamics data.
Abstract: Understanding the dynamics of proteins is critical for elucidating their biological functions.
The increasing availability of molecular dynamics (MD) data enables the training of deep generative models to efficiently explore the conformational space of proteins.
However, existing approaches either fail to explicitly capture the temporal dependencies between conformations or do not support direct generation of time-independent samples.
To address these limitations, we introduce *ConfRover*, an autoregressive model that simultaneously learns protein conformation and dynamics from MD trajectory data, supporting both time-dependent and time-independent sampling.
At the core of our model is a modular architecture comprising: (i) an *encoding layer*, adapted from protein folding models, that embeds protein-specific information and conformation at each time frame into a latent space; (ii) a *temporal module*, a sequence model that captures conformational dynamics across frames; and (iii) an SE(3) diffusion model as the *structure decoder*, generating conformations in continuous space.
Experiments on ATLAS, a large-scale protein MD dataset of diverse structures, demonstrate the effectiveness of our model in learning conformational dynamics and supporting a wide range of downstream tasks.
*ConfRover* is the first model to sample both protein conformations and trajectories within a single framework, offering a novel and flexible approach for learning from protein MD data.
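
The three-part design described above (encoding layer, temporal module, structure decoder) can be illustrated with a toy autoregressive loop. This is a minimal sketch under strong assumptions, not the ConfRover implementation: conformations are plain float vectors rather than SE(3) residue frames, the temporal module is a running mean rather than a learned sequence model, and the decoder is a fixed noise-to-context refinement rather than a trained diffusion model. All names (`encode`, `temporal_module`, `decode`, `sample_trajectory`, `protein_embedding`) are hypothetical.

```python
import random
from typing import List, Optional

Vector = List[float]  # toy stand-in for one conformation or one latent


def encode(frame: Vector, protein_embedding: Vector) -> Vector:
    """Encoding layer (toy): fuse protein-specific information with one frame."""
    return [f + p for f, p in zip(frame, protein_embedding)]


def temporal_module(latents: List[Vector], protein_embedding: Vector) -> Vector:
    """Temporal module (toy): summarize past frame latents into a context vector.

    With no history, the context is the protein embedding alone, which
    mimics time-independent sampling in this sketch.
    """
    if not latents:
        return list(protein_embedding)
    n = len(latents)
    return [sum(col) / n for col in zip(*latents)]


def decode(context: Vector, rng: random.Random, steps: int = 20) -> Vector:
    """Structure decoder (toy): start from Gaussian noise and refine it
    step by step toward the context. A real decoder would be a learned
    diffusion model; this fixed update only shows the control flow."""
    x = [rng.gauss(0.0, 1.0) for _ in context]
    for _ in range(steps):
        x = [xi + 0.5 * (ci - xi) for xi, ci in zip(x, context)]
    return x


def sample_trajectory(
    protein_embedding: Vector,
    n_frames: int,
    start: Optional[Vector] = None,
    seed: int = 0,
) -> List[Vector]:
    """Generate frames one at a time, each conditioned on all earlier frames."""
    rng = random.Random(seed)
    frames: List[Vector] = [] if start is None else [list(start)]
    while len(frames) < n_frames:
        latents = [encode(f, protein_embedding) for f in frames]
        context = temporal_module(latents, protein_embedding)
        frames.append(decode(context, rng))
    return frames
```

In this sketch, calling `sample_trajectory(emb, n_frames=4, start=x0)` plays the role of time-dependent sampling from a given starting frame, while `sample_trajectory(emb, n_frames=1)` draws a single frame with no history, which stands in for time-independent sampling.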
Primary Area: Machine learning for sciences (e.g. climate, health, life sciences, physics, social sciences)
Submission Number: 18637