Finish First, Perfect Later: Test-Time Token-Level Cross-Validation for Diffusion Large Language Models
Keywords: Diffusion Language Model, Decoding Strategy
TL;DR: We propose a new decoding method for diffusion LLMs to enhance error correction.
Abstract: Diffusion large language models (dLLMs) have recently emerged as a promising alternative to autoregressive LLMs, offering accelerated parallel decoding and improved global context modeling through bidirectional attention. However, vanilla decoding strategies in dLLMs suffer from a critical limitation: once accepted, a token cannot be revised in subsequent steps. As a result, early mistakes persist across iterations, harming both intermediate predictions and final output quality. To address this issue, we propose Tolerator (Token-Level Cross-Validation Refinement), a training-free decoding strategy that leverages cross-validation among predicted tokens. While many existing methods follow a single-pass progressive unmasking design, Tolerator introduces a two-stage process: (i) sequence fill-up and (ii) iterative refinement that resamples a subset of tokens while the remaining tokens serve as context. This design allows previously accepted tokens to be reconsidered and corrected when necessary, leading to more reliable diffusion decoding outputs. We evaluate Tolerator on five standard benchmarks covering language understanding, code generation, and mathematical reasoning. Empirically, our method achieves consistent improvements over baselines under the same computational budget. These findings suggest that decoding algorithms are crucial to realizing the full potential of diffusion large language models.
Primary Area: generative models
Submission Number: 23080
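
The sketch below illustrates how the two-stage decoding described in the abstract could look in code: a fill-up stage that progressively unmasks the most confident positions, followed by a refinement stage that remasks and resamples a low-confidence subset while the rest of the sequence provides context. This is a minimal, hedged sketch, not the paper's implementation: the PyTorch tensors, the `model(tokens)` interface returning per-position logits, and all hyperparameter names (`fill_steps`, `refine_steps`, `resample_frac`) are assumptions introduced here for illustration.

```python
import torch

def tolerator_style_decode(model, tokens, mask_id,
                           fill_steps=8, refine_steps=8, resample_frac=0.25):
    """Hedged sketch of a two-stage dLLM decoding loop.

    `tokens` is a 1-D LongTensor holding `mask_id` at not-yet-decoded
    positions; `model(tokens)` is assumed to return (seq_len, vocab) logits.
    """
    # Stage 1: sequence fill-up -- unmask the most confident positions step by step.
    for _ in range(fill_steps):
        masked = (tokens == mask_id).nonzero(as_tuple=True)[0]
        if masked.numel() == 0:
            break
        probs = model(tokens).softmax(dim=-1)
        conf, pred = probs.max(dim=-1)
        # Commit a fixed share of the remaining masked positions per step.
        k = max(1, masked.numel() // max(1, fill_steps))
        chosen = masked[conf[masked].topk(k).indices]
        tokens[chosen] = pred[chosen]

    # Fill any leftover masks greedily so the sequence is complete.
    masked = (tokens == mask_id).nonzero(as_tuple=True)[0]
    if masked.numel() > 0:
        pred = model(tokens).argmax(dim=-1)
        tokens[masked] = pred[masked]

    # Stage 2: iterative refinement -- remask and resample the least-supported
    # tokens while the remaining tokens act as bidirectional context.
    for _ in range(refine_steps):
        probs = model(tokens).softmax(dim=-1)
        conf = probs.gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
        k = max(1, int(resample_frac * tokens.numel()))
        resample = conf.topk(k, largest=False).indices
        trial = tokens.clone()
        trial[resample] = mask_id
        new_pred = model(trial).softmax(dim=-1).argmax(dim=-1)
        tokens[resample] = new_pred[resample]

    return tokens
```

Under this reading, both stages reuse the same forward pass per iteration, so the total number of model calls (and hence the compute budget) is fixed by `fill_steps + refine_steps` rather than by how many tokens end up being revised.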