Denoising is not the End: Discrete Diffusion Language Models with Self-Correction

Published: 02 Mar 2026, Last Modified: 18 Mar 2026, LIT Workshop @ ICLR 2026, CC BY 4.0
Track: long paper (up to 10 pages)
Keywords: Masked Diffusion Model, Self-Correction
Abstract: Discrete Diffusion Models have emerged as an effective approach to text generation, providing an alternative to autoregressive models. However, the generated texts often suffer from quality issues, including grammatical errors, lack of fluency, and factual inaccuracies. This paper systematically evaluates different self-correction strategies for hybrid-noise (masking and uniform) discrete diffusion language models. The outputs of the base model are compared with self-corrected outputs under different strategies, with improvements measured in terms of quality, fluency, and fidelity. Our experiments show significant quality gains from iterative self-correction, with improvements in grammar, factuality, clarity, and creativity reaching double-digit percentages. We also observe a tradeoff between text quality and content preservation, and we identify configurations that achieve substantial quality enhancements while maintaining high fidelity to the original output. Finally, we find that it is optimal to allocate inference compute to denoising and self-correction in approximately equal proportions.
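The abstract's final finding concerns how a fixed inference budget is split between denoising steps and self-correction steps. The sketch below is a minimal, hedged illustration of that allocation only; `denoise_step`, `self_correct_step`, `generate`, and `correction_ratio` are hypothetical placeholder names, not the paper's actual implementation, and the toy model calls stand in for a real hybrid-noise discrete diffusion sampler.

```python
# Hedged sketch: splitting an inference budget roughly equally between
# denoising and self-correction, as the abstract reports to be optimal.
# All functions below are illustrative placeholders, not the authors' method.

from typing import List

MASK = "<mask>"


def denoise_step(tokens: List[str], step: int) -> List[str]:
    # Placeholder: a real masked/uniform-noise diffusion model would predict
    # tokens for noised positions here. This toy version fills one mask per step.
    out = tokens[:]
    for i, t in enumerate(out):
        if t == MASK:
            out[i] = f"tok{step}"
            break
    return out


def self_correct_step(tokens: List[str], step: int) -> List[str]:
    # Placeholder: a real self-correction pass would re-noise and re-predict
    # low-quality positions in the already-denoised sequence. No-op stand-in.
    return tokens


def generate(seq_len: int, total_steps: int, correction_ratio: float = 0.5) -> List[str]:
    """Split `total_steps` of inference compute between a denoising phase and a
    self-correction phase. The default ratio of 0.5 encodes the abstract's
    ~equal-allocation finding (an assumption in this sketch)."""
    n_correct = int(total_steps * correction_ratio)
    n_denoise = total_steps - n_correct

    tokens = [MASK] * seq_len
    for step in range(n_denoise):      # denoising phase
        tokens = denoise_step(tokens, step)
    for step in range(n_correct):      # self-correction phase
        tokens = self_correct_step(tokens, step)
    return tokens


if __name__ == "__main__":
    print(generate(seq_len=8, total_steps=16))
```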
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 37