TL;DR: GFlowNets meet flow matching for 3D molecule generation via synthesis pathways.
Abstract: Many generative applications, such as synthesis-based 3D molecular design, involve constructing compositional objects with continuous features.
Here, we introduce Compositional Generative Flows (CGFlow), a novel framework that extends flow matching to generate objects in compositional steps while modeling continuous states.
Our key insight is that modeling compositional state transitions can be formulated as a straightforward extension of the flow matching interpolation process.
We further build upon the theoretical foundations of generative flow networks (GFlowNets), enabling reward-guided sampling of compositional structures.
We apply CGFlow to synthesizable drug design by jointly designing the molecule's synthetic pathway with its 3D binding pose.
Our approach achieves state-of-the-art binding affinity and synthesizability on all 15 targets from the LIT-PCBA benchmark, and a 4.2x improvement in sampling efficiency compared to a 2D synthesis-based baseline.
To the best of our knowledge, our method is also the first to achieve state-of-the-art performance in both Vina Dock (-9.42) and AiZynth success rate (36.1%) on the CrossDocked2020 benchmark.
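As an illustration of the idea that compositional transitions can extend the flow matching interpolation, the following is a minimal sketch and not the paper's implementation. It assumes the standard linear interpolant x_t = (1 - s) * x_0 + s * x_1 and a hypothetical scheme in which each component (for example, a building block's coordinates) is added at its own start time and then interpolated on a rescaled local clock. The names `local_time`, `interpolate_state`, `t_start`, and `window` are assumptions introduced here for illustration only.

```python
from typing import List, Optional, Sequence

Vec = List[float]


def local_time(t: float, t_start: float, window: float) -> float:
    """Rescale global time t to a component's own clock, clamped to [0, 1]."""
    return min(1.0, max(0.0, (t - t_start) / window))


def interpolate_state(
    t: float,
    x0: Sequence[Vec],
    x1: Sequence[Vec],
    t_start: Sequence[float],
    window: float,
) -> List[Optional[Vec]]:
    """Return the partial object at global time t (hypothetical scheme).

    Components whose start time has not been reached are absent (None).
    Present components follow the linear interpolant between their noise
    sample x0 and their target x1, using their local time s.
    """
    state: List[Optional[Vec]] = []
    for noise, target, start in zip(x0, x1, t_start):
        if t < start:
            # Component has not been added to the object yet.
            state.append(None)
            continue
        s = local_time(t, start, window)
        state.append([(1.0 - s) * a + s * b for a, b in zip(noise, target)])
    return state


# Two components: the first is present from t = 0, the second from t = 0.5.
x0 = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]
x1 = [[1.0, 1.0, 1.0], [4.0, 4.0, 4.0]]
early = interpolate_state(0.25, x0, x1, t_start=[0.0, 0.5], window=0.5)
late = interpolate_state(0.75, x0, x1, t_start=[0.0, 0.5], window=0.5)
```

In this sketch, `early` contains only the first component, halfway along its path, while `late` contains the first component at its target and the second halfway along its own path. How CGFlow actually schedules and couples components may differ from this assumption.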
Lay Summary: Developing new drugs means discovering molecules that not only work well but can also be realistically synthesized in a wet lab. Current AI methods often fall short: either they design molecules that are hard to synthesize, or they don't fully consider how the molecule interacts with the target protein.
Our new method, CGFlow, is like a smart architect for molecules. It designs the ligand directly within the target protein's binding site while simultaneously planning its step-by-step "recipe" (synthesis pathway).
We've used CGFlow to create 3DSynthFlow, a system that excels at drug design. In multiple benchmark tests, 3DSynthFlow discovered molecules that bind their targets more strongly than those found by baseline models. Crucially, our approach ensures these molecules are synthesizable, overcoming a major hurdle in drug discovery. This co-design process could significantly accelerate structure-based drug design.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Primary Area: Applications->Chemistry, Physics, and Earth Sciences
Keywords: drug discovery, synthesizable molecular design, GFlowNets, flow matching
Submission Number: 14914