Lookahead Unmasking Elicits Reliable Decoding in Diffusion Language Models

06 Sept 2025 (modified: 11 Feb 2026)Submitted to ICLR 2026EveryoneRevisionsBibTeXCC BY 4.0
Keywords: generative models, diffusion language models
Abstract: Masked Diffusion Models (MDMs) as language models generate by iteratively unmasking tokens, yet their performance crucially depends on the inference- time order of unmasking. Prevailing heuristics, such as confidence-based sam- pling, are myopic: they optimize locally, fail to leverage extra test-time compute, and let early decoding mistakes cascade. We propose Lookahead Unmasking (LookUM), which addresses these concerns by reformulating sampling as path selection over all possible unmasking orders without the need for an external reward model. Our framework couples (i) a path generator that proposes paths by sampling from pools of unmasking sets with (ii) a verifier that computes the uncertainty of the proposed paths and performs importance sampling to subse- quently select the final paths. Empirically, erroneous unmasking measurably in- flates sequence-level uncertainty, and our method exploits this to avoid error-prone trajectories. We validate our framework across six benchmarks, such as mathe- matics, planning, and coding, and demonstrate consistent performance improve- ments. LookUM requires only two to three paths to achieve peak performance, demon- strating remarkably efficient path selection. The consistent improvements on both LLaDA and post-trained LLaDA 1.5 are particularly striking: base LLaDA with LookUM rivals the performance of RL-tuned LLaDA 1.5, while LookUM further enhances LLaDA 1.5 itself—showing that uncertainty-based verification provides orthogonal benefits to reinforcement learning and underscoring the versatility of our framework. Code will be publicly released.
Primary Area: generative models
Submission Number: 2623
Loading