Keywords: molecule, generation, flow matching, diffusion, chemistry, synthesizable
TL;DR: A multimodal model that generates synthesizable molecules together with their 3D coordinates, sets a new state of the art across metrics, and supports conditional generation such as pharmacophore conditioning.
Abstract: Ensuring synthesizability in generative small molecule design remains a major challenge. While recent developments in synthesizable molecule generation have demonstrated promising results, these efforts have been largely confined to 2D molecular graph representations, limiting the ability to perform geometry-based conditional generation. In this work, we present SYNCOGEN (Synthesizable Co-Generation), a single framework that combines simultaneous masked graph diffusion and flow matching for synthesizable 3D molecule generation. SYNCOGEN samples from the joint distribution of molecular building blocks, chemical reactions, and atomic coordinates. To train the model, we curated SYNSPACE, a dataset containing over 600K synthesis-aware building block graphs and 3.3M conformers. SYNCOGEN achieves state-of-the-art performance in unconditional small molecule graph and conformer generation, and the model delivers competitive performance in zero-shot molecular linker design and pharmacophore conditioning for protein ligand generation in drug discovery. Overall, this multimodal formulation represents a foundation for future applications enabled by non-autoregressive molecular generation, including analog expansion, lead optimization, and direct structure conditioning.
Primary Area: generative models
Submission Number: 21090