Keywords: discrete diffusion, test-time scaling, noise optimization, uniform discrete diffusion
TL;DR: Discrete Diffusion Noise Optimization optimizes the initial uniform discrete sequence for reward-aligned generation at test-time.
Abstract: Aligning discrete diffusion models with downstream rewards remains challenging: step-wise guidance is myopic and degrades sample quality, while fine-tuning is expensive and task-specific. We introduce Discrete Diffusion Noise Optimization (DDNO), a training-free method that instead optimizes the initial discrete noise to maximize terminal rewards while keeping the generator frozen. DDNO parameterizes the noise distribution with continuous logits and propagates gradients through the reverse process via a straight-through surrogate combined with soft mixing, enabling stable optimization over long denoising trajectories. On compositional text-to-image synthesis and controllable text generation, DDNO consistently outperforms inference-time baselines such as guidance and Best-of-N while exhibiting favorable scaling. This positions DDNO as a promising axis for test-time scaling in discrete generative models, complementing advances in continuous diffusion.
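The core mechanism in the abstract can be illustrated with a minimal sketch: continuous logits parameterize a distribution over discrete tokens, the forward pass mixes a hard sample with the soft probabilities, and a straight-through surrogate routes the reward gradient back through the logits while the generator stays frozen. Everything below is a toy stand-in, not the paper's implementation: the "generator" is a fixed per-position embedding lookup, the reward is negative squared distance to a target vector, and the names (`decode`, `alpha`, `theta`) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

L, V, D = 4, 8, 6                      # sequence length, vocab size, output dim
W = rng.normal(size=(L, V, D))         # frozen toy "generator": per-position embeddings
target = rng.normal(size=D)            # reward is measured against this vector

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def decode(x):
    # x: (L, V) one-hot or soft assignment -> pooled output vector
    return np.einsum('lv,lvd->d', x, W)

def reward(y):
    return -np.sum((y - target) ** 2)  # terminal reward, higher is better

def hard_sample(p):
    # greedy discrete sample (argmax) as a one-hot matrix
    x = np.zeros_like(p)
    x[np.arange(L), p.argmax(-1)] = 1.0
    return x

theta = rng.normal(size=(L, V))        # continuous logits over the discrete noise
alpha, lr = 0.5, 0.5                   # soft-mixing weight, step size

r0 = reward(decode(hard_sample(softmax(theta))))
for _ in range(200):
    p = softmax(theta)
    x_hard = hard_sample(p)
    # soft mixing: forward pass blends the hard sample with soft probabilities
    x = alpha * x_hard + (1.0 - alpha) * p
    y = decode(x)
    g_y = -2.0 * (y - target)          # dR/dy for the quadratic reward
    # straight-through surrogate: treat the hard branch as if it were p,
    # so the full gradient flows into the soft probabilities
    g_p = np.einsum('lvd,d->lv', W, g_y)
    # softmax backward per position: dR/dtheta_i = p_i * (g_i - <p_i, g_i>)
    g_theta = p * (g_p - (p * g_p).sum(-1, keepdims=True))
    theta += lr * g_theta              # gradient ascent on the terminal reward
r1 = reward(decode(hard_sample(softmax(theta))))
print(f"reward before: {r0:.3f}, after: {r1:.3f}")
```

The point of the sketch is the gradient path: the hard, discrete sample is what the frozen generator actually consumes, but the backward pass pretends the sample was the soft distribution, which is what makes optimization over the logits stable. In the paper this surrogate is propagated through an entire reverse denoising trajectory rather than a single decode step.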
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 110