RETRO SYNFLOW: Discrete Flow-Matching for Accurate and Diverse Single-Step Retrosynthesis

Published: 18 Sept 2025, Last Modified: 29 Oct 2025, Venue: NeurIPS 2025 poster, License: CC BY 4.0
Keywords: retrosynthesis, flow matching, discrete data
Abstract: A fundamental challenge in organic chemistry is identifying and predicting the sequence of reactions that synthesize a desired target molecule. Due to the combinatorial nature of the chemical search space, single-step reactant prediction—i.e., single-step retrosynthesis—remains difficult, even for state-of-the-art template-free generative methods. These models often struggle to produce an accurate yet diverse set of feasible reactions in a chemically rational manner. In this paper, we propose RETRO SYNFLOW (RSF), a discrete flow-matching framework that formulates single-step retrosynthesis as a Markov bridge between a given product molecule and its corresponding reactants. Unlike prior approaches, RSF introduces a reaction center identification step to extract intermediate structures, or synthons, which serve as a more informative and structured source distribution for the discrete flow model. To further improve the diversity and chemical feasibility of generated samples, RSF incorporates Feynman-Kac (FK) steering with Sequential Monte Carlo (SMC) resampling at inference time. This approach leverages a learned forward-synthesis reward oracle to guide the generation process toward more promising reactant candidates. Empirically, RSF substantially outperforms the previous state-of-the-art methods in top-1 accuracy. In addition, FK-steering significantly improves round-trip accuracy, demonstrating stronger chemical validity and synthetic feasibility, all while maintaining competitive top-k performance. These results establish RSF as a new leading approach for single-step retrosynthesis prediction.
Supplementary Material: zip
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 6966
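
The abstract describes steering generation at inference time by resampling candidates according to a reward from a forward-synthesis oracle. A minimal sketch of one such reward-weighted resampling step follows. It is not the authors' implementation: the function name `resample_particles`, the `temperature` parameter, and the exponential weighting are illustrative assumptions, and the rewards are plain floats standing in for the scores a learned oracle would produce.

```python
import math
import random


def resample_particles(particles, rewards, temperature=1.0, rng=None):
    """Multinomial resampling with weights proportional to exp(reward / temperature).

    Hypothetical helper for illustration only. `particles` are candidate
    samples (e.g. partially generated reactant sets) and `rewards` are
    scalar scores, one per particle.
    """
    if not particles or len(particles) != len(rewards):
        raise ValueError("particles and rewards must be non-empty and equal in length")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    # Subtract the maximum reward before exponentiating for numerical stability.
    max_reward = max(rewards)
    weights = [math.exp((r - max_reward) / temperature) for r in rewards]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Draw a new population of the same size; high-reward particles are
    # duplicated more often and low-reward particles tend to drop out.
    resampled = rng.choices(particles, weights=probs, k=len(particles))
    return resampled, probs


if __name__ == "__main__":
    # Toy candidates and made-up scores (assumed values, not paper results).
    candidates = ["candidate_a", "candidate_b", "candidate_c"]
    scores = [0.1, 0.9, 0.5]
    new_population, probabilities = resample_particles(
        candidates, scores, temperature=0.5, rng=random.Random(0)
    )
    print(new_population)
    print([round(p, 3) for p in probabilities])
```

In a full sampler, a step like this would be interleaved with the generative model's own transitions, so that the particle population is repeatedly pushed toward candidates the oracle scores highly. A lower `temperature` concentrates the population more sharply on high-reward candidates, at the cost of diversity.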