Abstract: The shift toward structure-driven molecule generation has been propelled by advances in deep generative models such as variational autoencoders and diffusion models. However, generative models for molecular design remain constrained by exposure bias, error accumulation, and suboptimal handling of activity cliffs. Here, we introduce DiffGap, a diffusion-based framework that integrates adaptive sampling and pseudo-molecule estimation to bridge the gap between training objectives and inference dynamics in 3D molecule generation. By dynamically aligning intermediate denoising steps with realistic generation trajectories, DiffGap lets the diffusion model adapt during training to the biased inputs it will encounter at inference. A temperature annealing module further controls the strength of this adaptive alignment, ensuring stable learning of the data distribution. Evaluated on the CrossDocked2020 benchmark, DiffGap outperforms existing methods in docking scores and binding affinity, demonstrating superior fidelity in generating drug-like molecules. Our work establishes a principled approach to harmonizing generative training with inference mechanics, offering a robust computational toolkit for accelerating structure-based therapeutic discovery. The source code of DiffGap will be published after review.
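The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes, not the DiffGap implementation. It assumes a standard noise-prediction diffusion setup over a flat list of coordinates. The cosine schedule in `alpha_bar`, the linear `temperature` schedule, the rule `strength = 1 - temperature`, and every function name are hypothetical choices made for illustration. With some probability, the training input is built from a pseudo-sample estimated by the model itself rather than from the ground-truth sample, so the model sees inference-like inputs during training.

```python
import math
import random


def alpha_bar(t, num_steps):
    """Cumulative signal level at step t (assumed cosine-style schedule)."""
    return math.cos(0.5 * math.pi * t / (num_steps + 1)) ** 2


def forward_noise(x0, t, num_steps, rng):
    """Sample x_t from q(x_t | x_0); return x_t and the noise that was used."""
    a = alpha_bar(t, num_steps)
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


def estimate_x0(xt, eps_pred, t, num_steps):
    """Invert the forward process with a predicted noise to get a pseudo x_0."""
    a = alpha_bar(t, num_steps)
    return [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a)
            for x, e in zip(xt, eps_pred)]


def temperature(step, total_steps, tau_start=1.0, tau_end=0.0):
    """Hypothetical linear annealing from tau_start down to tau_end."""
    return tau_start + (tau_end - tau_start) * step / total_steps


def make_training_input(x0, t, num_steps, step, total_steps,
                        noise_predictor, rng):
    """Build the noisy input for diffusion step t at a given training step.

    With probability (1 - temperature), replace the ground-truth x_0 by a
    pseudo-sample estimated from step t + 1, then re-noise it to step t.
    Returns the input and a flag saying whether the pseudo-sample was used.
    """
    strength = 1.0 - temperature(step, total_steps)  # assumed mixing rule
    if t < num_steps and rng.random() < strength:
        x_next, _ = forward_noise(x0, t + 1, num_steps, rng)
        eps_pred = noise_predictor(x_next, t + 1)
        x0_hat = estimate_x0(x_next, eps_pred, t + 1, num_steps)
        xt, _ = forward_noise(x0_hat, t, num_steps, rng)
        return xt, True
    xt, _ = forward_noise(x0, t, num_steps, rng)
    return xt, False
```

In this sketch the alignment strength is zero at the start of training, so the model first learns from clean ground-truth samples, and it rises as the temperature falls. Here `noise_predictor` stands in for the denoising network and can be any callable taking `(x_t, t)` and returning a noise estimate of the same length.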
Submission Number: 281