Keywords: Molecule Generation, Diffusion Model
TL;DR: S$^2$-HDM is a hierarchical diffusion model that jointly optimizes scaffolds and substituents with differentiated noise schedules, enabling end-to-end generation of molecules that balance scaffold integrity with substituent diversity.
Abstract: Deep generative models have emerged as powerful tools for efficiently navigating the vast chemical space and generating molecules with desirable properties. However, existing approaches, particularly diffusion-based models, struggle to model the hierarchical structure of drug-like molecules, which typically consist of a core scaffold and attached substituent functional groups. This hierarchical decomposition is central to modern drug design strategies, where scaffold hopping and lead optimization are applied iteratively to refine molecular structure. While traditional methods can optimize each component separately, they often rely on rule-based heuristics and lack the capacity for joint optimization. To address these limitations, we propose the Scaffold–Substituent Hierarchical Diffusion Model (S$^2$-HDM). It unifies the principles of scaffold hopping and lead optimization within a single generative framework by introducing a differentiated noise schedule for scaffold and substituent atoms. In the reverse process, scaffold atoms are prioritized for early denoising, providing a stable core context that guides the flexible generation of diverse substituents.
Unlike traditional approaches, S$^2$-HDM implicitly learns the scaffold and substituent hierarchy without pre-defined functional groups, enabling an end-to-end generation pipeline. We validate the effectiveness of our method through extensive experiments, in which S$^2$-HDM achieves strong performance on multiple generation benchmarks. These results underscore the model’s potential to advance drug design by balancing scaffold integrity with substituent diversity, aligning closely with structure-based design principles. The code can be found at \href{https://anonymous.4open.science/r/S2-HDM-6F23}{https://anonymous.4open.science/r/S2-HDM-6F23}.
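To make the idea of a differentiated noise schedule concrete, the sketch below shows one way scaffold atoms could carry less noise than substituent atoms at the same diffusion step, so that they become clean earlier in the reverse process. It is an illustration only, not the authors' implementation: the cosine signal curve, the `warp` exponent, the function names, and the given scaffold mask are all assumptions introduced here (the abstract states that the model learns the hierarchy implicitly).

```python
import math
import random


def alpha_bar(s):
    """Cosine-style signal level: 1.0 at s=0 (clean), ~0.0 at s=1 (pure noise)."""
    return math.cos(0.5 * math.pi * s) ** 2


def per_atom_alpha_bar(t, num_steps, is_scaffold, warp=2.0):
    """Signal level of every atom at step t.

    Assumption: scaffold atoms use a warped time s**warp (warp > 1), which is
    smaller than s for 0 < s < 1, so they are less noisy at the same step.
    """
    s = t / num_steps
    return [alpha_bar(s ** warp) if flag else alpha_bar(s) for flag in is_scaffold]


def forward_noise(coords, t, num_steps, is_scaffold, rng, warp=2.0):
    """Sample noisy coordinates x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps per atom."""
    levels = per_atom_alpha_bar(t, num_steps, is_scaffold, warp)
    noisy = []
    for xyz, a in zip(coords, levels):
        sigma = math.sqrt(max(0.0, 1.0 - a))
        noisy.append([math.sqrt(a) * x + sigma * rng.gauss(0.0, 1.0) for x in xyz])
    return noisy


# Hypothetical toy molecule: two scaffold atoms and one substituent atom.
coords = [[0.0, 0.0, 0.0], [1.4, 0.0, 0.0], [2.1, 1.2, 0.0]]
is_scaffold = [True, True, False]

levels = per_atom_alpha_bar(t=50, num_steps=100, is_scaffold=is_scaffold)
noisy = forward_noise(coords, 50, 100, is_scaffold, random.Random(0))
print(levels)
```

Halfway through the schedule, the scaffold atoms in this sketch retain a higher signal level than the substituent atom, which is the sense in which the core is resolved first and can serve as context for the substituents.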
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 14990