Keywords: generative models, diffusion language models
Abstract: Masked Diffusion Models (MDMs) generate text by iteratively unmasking tokens, yet their performance depends crucially on the inference-time order of unmasking. Prevailing heuristics, such as confidence-based sampling, are myopic: they optimize locally, fail to leverage extra test-time compute, and let early decoding mistakes cascade. We propose Lookahead Unmasking (LookUM), which addresses these concerns by reformulating sampling as path selection over all possible unmasking orders, without the need for an external reward model. Our framework couples (i) a path generator that proposes paths by sampling from pools of unmasking sets with (ii) a verifier that computes the uncertainty of the proposed paths and performs importance sampling to select the final paths. Empirically, erroneous unmasking measurably inflates sequence-level uncertainty, and our method exploits this signal to avoid error-prone trajectories. We validate our framework on six benchmarks spanning mathematics, planning, and coding, and demonstrate consistent performance improvements. LookUM requires only two to three paths to achieve peak performance, making path selection remarkably efficient. The consistent improvements on both LLaDA and the post-trained LLaDA 1.5 are particularly striking: base LLaDA with LookUM rivals the performance of RL-tuned LLaDA 1.5, while LookUM further enhances LLaDA 1.5 itself, showing that uncertainty-based verification provides benefits orthogonal to reinforcement learning and underscoring the versatility of our framework. Code will be publicly released.
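The generate-then-verify loop described in the abstract can be sketched as follows. This is a toy illustration only, not the paper's implementation: all function names are hypothetical, the "model" is a stand-in predictive distribution, and uncertainty is approximated by mean token entropy along a path. It shows the two coupled components: a path generator that proposes candidate unmasking orders, and a verifier that weights each path by its sequence-level uncertainty and importance-samples the final path.

```python
import math
import random

def token_entropy(probs):
    """Shannon entropy of one token's predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def path_uncertainty(path, predict):
    """Sequence-level uncertainty: mean token entropy accumulated
    along the unmasking order, conditioning on revealed positions."""
    revealed, total = set(), 0.0
    for pos in path:
        total += token_entropy(predict(pos, revealed))
        revealed.add(pos)
    return total / len(path)

def lookum_select(masked_positions, predict, num_paths=3, rng=random):
    # (i) Path generator: propose candidate unmasking orders
    #     (here, uniform random permutations for illustration).
    paths = []
    for _ in range(num_paths):
        order = masked_positions[:]
        rng.shuffle(order)
        paths.append(order)
    # (ii) Verifier: weight paths by exp(-uncertainty), then
    #      importance-sample one final path from those weights.
    weights = [math.exp(-path_uncertainty(p, predict)) for p in paths]
    total = sum(weights)
    return rng.choices(paths, weights=[w / total for w in weights], k=1)[0]

# Stand-in predictive model (hypothetical): a token is "easy"
# (low entropy) when its left neighbor is already revealed.
def predict(pos, revealed):
    if pos == 0 or (pos - 1) in revealed:
        return [0.9, 0.05, 0.03, 0.02]
    return [0.4, 0.3, 0.2, 0.1]

best = lookum_select([0, 1, 2], predict, num_paths=3)
print(best)  # a candidate order, biased toward low-uncertainty paths
```

Under this toy model, a left-to-right order accumulates less entropy than a right-to-left one, so the verifier's importance weights favor it; the paper's point is that this uncertainty signal is computed by the model itself, with no external reward model.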
Primary Area: generative models
Submission Number: 2623