FlowBack: A Flow-matching Approach for Generative Backmapping of Macromolecules

Published: 17 Jun 2024, Last Modified: 16 Jul 2024, Venue: ML4LMS Oral, License: CC BY 4.0
Keywords: Backmapping, Generative, Flow-matching, Transferable, DNA-protein
Abstract: Coarse-grained models have become ubiquitous in biomolecular modeling tasks aimed at studying slow dynamical processes such as protein folding and DNA hybridization. Although these models considerably accelerate sampling, it remains challenging to recover an ensemble of all-atom structures corresponding to coarse-grained simulations. In this work, we introduce a generative approach called FlowBack that uses a flow-matching objective to map samples from a coarse-grained prior distribution to an all-atom data distribution. We construct our prior distribution to be amenable to any coarse-grained map and any type of macromolecule, and we find that generated structures are more robust and contain fewer steric clashes than those generated by previous approaches. We train a protein-specific model on structures from the Protein Data Bank (PDB), which achieves state-of-the-art results on bond quality and clash score. Furthermore, we train a model on DNA-protein data, which achieves excellent reconstruction and generative capabilities on complexes from the PDB as well as on coarse-grained simulations of DNA-protein binding.
Supplementary Material: pdf
Poster: pdf
Submission Number: 21
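
Illustrative sketch (not from the paper): the abstract describes mapping samples from a coarse-grained prior to an all-atom distribution with a flow-matching objective. The NumPy snippet below shows one common form of that idea, assuming a linear interpolation path between a prior sample and a data sample, and assuming a prior that places each atom at its parent bead plus Gaussian noise. The function names, the `sigma` noise scale, and the `atom_to_bead` index array are hypothetical choices made for illustration; the abstract does not specify the prior, the network, or the integrator that FlowBack actually uses.

```python
import numpy as np

def sample_prior(bead_xyz, atom_to_bead, sigma, rng):
    """Assumed prior: each atom sits at its parent bead plus isotropic Gaussian noise."""
    centers = bead_xyz[atom_to_bead]  # (n_atoms, 3)
    return centers + sigma * rng.standard_normal(centers.shape)

def flow_matching_pair(x0, x1, t):
    """Linear interpolant x_t between prior sample x0 and data sample x1,
    together with its target velocity x1 - x0 (constant along the path)."""
    x_t = (1.0 - t) * x0 + t * x1
    return x_t, x1 - x0

def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between a predicted velocity and the target velocity."""
    x_t, target = flow_matching_pair(x0, x1, t)
    return float(np.mean((velocity_fn(x_t, t) - target) ** 2))

def euler_integrate(velocity_fn, x0, n_steps=50):
    """Generate a structure by integrating dx/dt = v(x, t) from t=0 to t=1."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x

# Toy usage: two beads, five atoms, and an "oracle" velocity field that
# stands in for a trained network (it already knows the target x1).
rng = np.random.default_rng(0)
bead_xyz = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0]])
atom_to_bead = np.array([0, 0, 0, 1, 1])
x0 = sample_prior(bead_xyz, atom_to_bead, sigma=0.5, rng=rng)
x1 = rng.standard_normal((5, 3))  # placeholder all-atom coordinates
oracle = lambda x, t: (x1 - x) / (1.0 - t)
loss = flow_matching_loss(oracle, x0, x1, t=0.3)
x_gen = euler_integrate(oracle, x0, n_steps=50)
```

In practice `velocity_fn` would be a learned model trained by minimizing this loss over random `t` and over pairs of prior and data samples; the oracle here only checks that the path and integrator are mutually consistent.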