Autoregressive Models Enable Efficient Conditional 3D Molecular Generation

Published: 28 May 2026, Last Modified: 28 May 2026
Venue: GenBio 2026 Spotlight
License: CC BY 4.0
Keywords: generative models, molecules, 3D, autoregressive, conditional generation, fragments
TL;DR: We introduce Symphony++, a small, efficient autoregressive 3D molecule generation model that excels at generation conditioned on fragments and on quantum mechanical properties.
Abstract: The reigning paradigm for small-molecule 3D structure generation in recent years has been stochastic interpolant models, a class that includes diffusion and flow-based generative models. These models learn to transport samples from an easy-to-sample base distribution (such as a Gaussian distribution defined over $\mathbb{R}^{3N}$, where $N$ is the number of atoms) to the distribution of 3D molecular structures. Critically, the number of atoms $N$ must be sampled a priori, before the learned transport process, because all atoms are transported simultaneously. This makes such models hard to use in tasks such as fragment completion, where generation must proceed from an incomplete molecule while the number of remaining atoms is unknown. Indeed, most benchmarks for small-molecule 3D structure generation test only unconditional generation, where the goal is to sample possible 3D molecular structures without any constraints. Moreover, existing metrics overemphasize exact matching of bond-length and angular distributions. We argue that the key goal of molecule generation is conditional generation, where we wish to generate molecules subject to geometric or chemical constraints. We show that the long-forgotten approach of building molecules autoregressively performs favorably in these regimes. In fact, by carefully engineering the simple training recipe proposed in Symphony, we find that autoregressive molecule generative models 1) learn significantly more efficiently than existing diffusion/flow-based models, 2) enable significantly more accurate conditional generation in terms of quantum mechanical properties, and 3) enable simple and efficient fragment completion with high success rates.
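To make the structural contrast in the abstract concrete, here is a minimal toy sketch in Python. It is not the Symphony++ model or its training recipe. The function names (`fixed_size_sample`, `toy_next_atom`, `autoregressive_complete`), the 1.5 Å placement distance, and the stopping rule are all hypothetical stand-ins for learned components. The sketch shows only that a fixed-size sampler needs $N$ as an input, while an autoregressive loop can start from a fragment and decide for itself when to stop.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a random direction by normalizing a 3D Gaussian sample."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in v)) or 1.0  # guard against a zero vector
    return [c / norm for c in v]


def fixed_size_sample(num_atoms, rng):
    """Interpolant-style starting point: N must be chosen before sampling.

    A real diffusion/flow model would then transport all N positions jointly;
    here we only draw the base Gaussian sample over R^{3N}.
    """
    return [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in range(num_atoms)]


def toy_next_atom(atoms, rng, max_atoms):
    """Hypothetical stand-in for a learned policy.

    Returns a new atom position, or None to signal "stop".
    The stop probability grows with the current size (an assumption).
    """
    if len(atoms) >= max_atoms or rng.random() < 0.5 * len(atoms) / max_atoms:
        return None
    anchor = rng.choice(atoms) if atoms else (0.0, 0.0, 0.0)
    direction = random_unit_vector(rng)
    # Place the new atom 1.5 angstroms from the chosen anchor (toy bond length).
    return tuple(a + 1.5 * d for a, d in zip(anchor, direction))


def autoregressive_complete(fragment, rng, max_atoms=12):
    """Grow a molecule one atom at a time, starting from a given fragment.

    The final atom count is an outcome of the loop, not an input.
    """
    atoms = list(fragment)
    while True:
        new_atom = toy_next_atom(atoms, rng, max_atoms)
        if new_atom is None:
            break
        atoms.append(new_atom)
    return atoms


if __name__ == "__main__":
    rng = random.Random(0)
    fragment = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    completed = autoregressive_complete(fragment, rng)
    print(f"fragment atoms: {len(fragment)}, completed atoms: {len(completed)}")
```

In this toy, fragment completion needs no special handling: the fragment is simply the initial state of the loop, whereas `fixed_size_sample` cannot be called without first committing to a total atom count.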
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 195