Re-evaluating Confidence Remasking in Masked Diffusion Language Models

Stipe Frkovic; Metod Jazbec; Dan Zhang; Christian A. Naesseth; Ilija Bogunovic; Eric Nalisnick

Published: 30 May 2026, Last Modified: 01 Jun 2026

Venue: SPIGM @ ICML Poster

License: CC BY 4.0

Keywords: dLLM, LLM, unmasking, remasking

TL;DR: We re-evaluate recently proposed post-hoc, confidence-based remasking in masked dLLMs, providing new insights into its limitations and promise.

Abstract: Masked diffusion language models (dLLMs) have recently emerged as a competitive alternative to autoregressive language models, with the promise of faster inference via parallel token generation. A notable limitation of the masked formulation, however, is that once a token has been unmasked it can no longer be revised, leaving dLLMs vulnerable to early sampling mistakes. To address this, a growing body of work has sought to extend masked dLLMs with self-correcting (remasking) capabilities. One appealing subset of these methods does so in a training-free, post-hoc manner based on token confidence, with encouraging early reported results. In this work, we revisit the empirical evaluation of a representative post-hoc remasking method, WINO, and find that under standard decoding settings (shorter block lengths) it brings little-to-no benefit over confidence-based unmasking alone. Extending the evaluation to non-greedy decoding, we find that while confidence-based remasking can mitigate errors introduced by increased stochasticity to some extent, it also exacerbates the diversity collapse previously reported for confidence-based unmasking. Overall, our results show that the benefits of post-hoc confidence-based remasking are highly setting-dependent, underscoring the need for a more comprehensive evaluation framework.
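For readers unfamiliar with the terms in the abstract, the sketch below illustrates the general idea of confidence-based unmasking combined with post-hoc confidence-based remasking in a single toy decoding step. It is not the WINO procedure and is not taken from the paper: the function name `confidence_step`, the `MASK` sentinel, both thresholds, and the fallback that unmasks one position to guarantee progress are all assumptions made for illustration, and the per-position probabilities stand in for the output of a masked dLLM.

```python
from typing import List, Optional

MASK = None  # hypothetical sentinel for a still-masked position


def confidence_step(
    tokens: List[Optional[int]],
    probs: List[List[float]],
    unmask_threshold: float = 0.9,  # assumed value, for illustration only
    remask_threshold: float = 0.5,  # assumed value, for illustration only
) -> List[Optional[int]]:
    """One toy step: remask weakly supported tokens, unmask confident ones."""
    out = list(tokens)
    masked = [i for i, t in enumerate(tokens) if t is MASK]

    # Post-hoc remasking: a committed token is reverted to MASK when the
    # model's current probability for it falls below remask_threshold.
    for i, t in enumerate(tokens):
        if t is not MASK and probs[i][t] < remask_threshold:
            out[i] = MASK

    # Confidence-based unmasking: commit every masked position whose top
    # probability reaches unmask_threshold (possibly several in parallel).
    unmasked_any = False
    for i in masked:
        top = max(probs[i])
        if top >= unmask_threshold:
            out[i] = probs[i].index(top)
            unmasked_any = True

    # Assumed fallback: if nothing was confident enough, commit the single
    # most confident masked position so that decoding still advances.
    if masked and not unmasked_any:
        best = max(masked, key=lambda i: max(probs[i]))
        out[best] = probs[best].index(max(probs[best]))

    return out


# Toy example over a 3-token vocabulary; positions 1 and 2 start masked.
tokens = [2, MASK, MASK, 1]
probs = [
    [0.10, 0.20, 0.70],  # committed token 2 keeps probability 0.70
    [0.05, 0.95, 0.00],  # confident: unmask to token 1
    [0.40, 0.35, 0.25],  # not confident: stays masked
    [0.60, 0.30, 0.10],  # committed token 1 has probability 0.30: remask
]
print(confidence_step(tokens, probs))  # [2, 1, None, None]
```

Setting `remask_threshold=0.0` disables remasking in this sketch and leaves only confidence-based unmasking, which is the baseline the abstract compares against.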

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 103
