It Takes Two to Tango: Directly Optimizing for Constrained Synthesizability in Generative Molecular Design

Published: 03 Mar 2025, Last Modified: 09 Apr 2025
Venue: AI4MAT-ICLR-2025 Poster
License: CC BY 4.0
Submission Track: Full Paper
Submission Category: AI-Guided Design
Keywords: generative design, constrained synthesizability, drug discovery
TL;DR: Generate molecules whose synthesis routes use pre-defined building blocks.
Abstract: Constrained synthesizability is an unaddressed challenge in generative molecular design. In particular, the task is to design molecules that satisfy multi-parameter optimization objectives while also being synthesizable *and* incorporating specific building blocks in their synthesis. This is practically important for molecule re-purposing, sustainability, and efficiency. In this work, we propose a novel reward function called **TANimoto Group Overlap (TANGO)**, which uses chemistry principles to transform a sparse reward function into a *dense* reward function -- crucial for reinforcement learning (RL). TANGO can augment molecular generative models to *directly* optimize for constrained synthesizability while simultaneously optimizing for other properties relevant to drug discovery. Our framework is general and addresses starting-material, intermediate, and divergent synthesis constraints. Contrary to many existing works in the field, we show that *incentivizing* a general-purpose model with RL is a productive approach to navigating challenging synthesizability optimization scenarios. We demonstrate this by showing that the trained models explicitly learn a desirable distribution. Our framework is the first *generative* approach to successfully address constrained synthesizability.
Submission Number: 7
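
The abstract contrasts a sparse reward with a dense one but does not state the TANGO formula. The sketch below is only an illustration of that contrast, not the paper's definition: it assumes each molecule in a synthesis route is represented as a set of hypothetical feature labels, and it uses plain Tanimoto similarity as the dense signal. The function names and feature labels are invented for this example.

```python
from typing import FrozenSet, Iterable

Features = FrozenSet[str]  # hypothetical stand-in for a molecular fingerprint


def tanimoto(a: Features, b: Features) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def sparse_reward(route: Iterable[Features], enforced: Features) -> float:
    """Return 1.0 only if the enforced block appears exactly in the route."""
    return 1.0 if any(node == enforced for node in route) else 0.0


def dense_reward(route: Iterable[Features], enforced: Features) -> float:
    """Return the best similarity between any route node and the enforced block.

    Assumption: this is plain Tanimoto similarity, not the TANGO reward itself.
    """
    return max((tanimoto(node, enforced) for node in route), default=0.0)


# Hypothetical enforced building block and a route that nearly contains it.
enforced = frozenset({"aryl", "boronic_acid", "fluoro"})
route = [
    frozenset({"aryl", "boronic_acid", "chloro"}),
    frozenset({"amine", "alkyl"}),
]

print(sparse_reward(route, enforced))  # no exact match, so no learning signal
print(dense_reward(route, enforced))   # partial credit for the near miss
```

Under the sparse reward, a route that almost contains the enforced block scores the same as one that is nowhere near it. The similarity-based reward gives such near misses partial credit, which is the kind of graded signal an RL agent can follow.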