Keywords: Constrained Molecular Generation, Discrete Diffusion, Constrained Optimization, Molecular Design, Safety-Critical Generation
Abstract: Discrete diffusion models are a class of generative models that construct sequences by progressively denoising samples from a categorical noise distribution. In life science settings, such as the design of molecular strings (SMILES) and other biological sequences, these models have emerged as a promising alternative to autoregressive architectures, presenting an opportunity to enforce sequence-level constraints, a capability that existing left-to-right sequence design cannot natively provide. This paper capitalizes on this opportunity by introducing $\textbf{Constrained Discrete Diffusion}$ (CDD), a novel integration of differentiable constraint optimization within the diffusion process to ensure adherence to biosafety policies and design properties during generation. Unlike conventional generators, which often rely on post-hoc filtering or model retraining for controllable generation, CDD imposes constraints directly within the discrete diffusion sampling process, resulting in a training-free and effective approach. Experiments in property-adherent molecular design, toxicity-bounded generation, and novelty enforcement demonstrate that CDD achieves $\textbf{zero constraint violations}$ across a diverse array of tasks, outperforming autoregressive and existing discrete diffusion approaches.
Submission Number: 28