Hit Expansion via Localized Exploration of Synthesizable Chemical Space

Published: 02 Mar 2026, Last Modified: 05 Mar 2026 | GEM 2026 | Everyone | CC BY 4.0
Keywords: drug discovery, GFlowNets, chemistry, machine learning, generative modeling, molecular optimization, Bayesian optimization, active learning
TL;DR: We propose a template-based GFlowNet framework that optimizes hit molecules by exploring local regions of chemical space.
Abstract: Generative models for drug design that directly produce synthetic pathways have gained significant popularity because they constrain the search space to synthetically accessible molecules. However, existing methods have focused primarily on *de novo* molecular design and rarely start the generation process from known binders. In this paper, we present HELiX, a template-based GFlowNet framework for localized exploration of chemical space. HELiX first learns to deconstruct a given hit by reversing selected reaction steps, and then performs forward synthesis in a manner that preserves synthetic tractability. Our approach efficiently identifies diverse, high-scoring analogs of known binders, and addresses the challenge of sample efficiency in GFlowNets by incorporating a Bayesian optimization loop that balances exploration and exploitation. We also show that local exploration is inherently robust to noisy oracle evaluations, a common problem in drug development when using *in silico* predictors of binding affinity.
Presenter: ~Walter_Virany1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 73