Improved Sampling from Masked Diffusion Models with Position Contrastive Guidance

Published: 23 Sept 2025 · Last Modified: 23 Dec 2025 · SPIGM @ NeurIPS · CC BY 4.0
Keywords: discrete diffusion models, masked diffusion models
TL;DR: A novel classifier-free-guidance-based approach for improving sampling from masked diffusion models
Abstract: Masked Diffusion Models (MDMs), which generate multiple tokens at a time, hold the promise of accelerating text generation. However, the performance of MDMs is sensitive to the order in which tokens are generated. We observe that MDMs are overconfident about the masked positions at the extreme ends of the output sequence. MDMs also express uncertainty by producing similar probability scores for tokens regardless of the query position. Building on these insights, we propose Position Contrastive Guidance (PCG), which has two components: a soft order bias that favors left-to-right decoding, and a novel classifier-free guidance that renormalizes the probabilities using position uncertainty so that more informative tokens are generated earlier. Our approach can easily be plugged into any existing uncertainty-guided sampling strategy. Experiments on GSM8K, MATH500, and HumanEval show that PCG improves both accuracy and throughput for the base and instruct versions of the DREAM-7B and LLaDA-8B models. We also present ablations to identify the contribution of each proposed component.
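
The abstract describes PCG only at a high level. The following is a minimal, illustrative sketch of how such a position-scoring rule could look inside a confidence-based MDM sampler: a soft left-to-right bias plus a contrastive term that rewards positions whose predictive distribution departs from the position-agnostic average. The function `pcg_position_scores` and the weights `alpha` and `beta` are hypothetical placeholders, not the paper's exact formulation.

```python
# A minimal sketch of a PCG-style position-selection step, assuming a standard
# confidence-based MDM sampler. The names `pcg_position_scores`, `alpha`
# (order-bias strength), and `beta` (guidance strength) are hypothetical.
import numpy as np

def pcg_position_scores(logits: np.ndarray, masked: np.ndarray,
                        alpha: float = 0.1, beta: float = 1.0) -> np.ndarray:
    """Score still-masked positions; higher scores are decoded earlier.

    logits: (seq_len, vocab) per-position token logits from the MDM.
    masked: (seq_len,) boolean array marking still-masked positions.
    """
    seq_len, _ = logits.shape
    # Per-position token distributions and confidence (max probability).
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    confidence = probs.max(axis=-1)

    # Component 1: soft order bias, a mild bonus for earlier positions so
    # that, other things being equal, decoding proceeds left to right.
    order_bias = -alpha * np.arange(seq_len) / seq_len

    # Component 2: contrastive guidance. Positions whose distribution is
    # close to the average (position-agnostic) distribution carry little
    # position-specific information; the KL divergence to that average
    # rewards positions with more informative predictions.
    mean_probs = probs[masked].mean(axis=0, keepdims=True)
    kl_to_mean = (probs * (np.log(probs + 1e-12)
                           - np.log(mean_probs + 1e-12))).sum(axis=-1)

    scores = np.log(confidence + 1e-12) + order_bias + beta * kl_to_mean
    return np.where(masked, scores, -np.inf)

# Usage: unmask the top-k scoring positions at this denoising step.
rng = np.random.default_rng(0)
logits = rng.normal(size=(16, 100))   # toy logits for a 16-token sequence
masked = np.ones(16, dtype=bool)
unmask_next = np.argsort(pcg_position_scores(logits, masked))[::-1][:4]
```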
Submission Number: 122