Keywords: Diffusion Large Language Models
TL;DR: We propose a simple yet effective self-evaluation confidence quantification method for diffusion large language models (dLLMs), and introduce a flexible-length dLLM generation framework based on it.
Abstract: Diffusion large language models (dLLMs) have recently attracted significant attention for their ability to enhance diversity, controllability, and parallelism. However, their non-sequential, bidirectionally masked generation makes quality assessment difficult, underscoring the need for effective self-evaluation. In this work, we propose DiSE, a simple yet effective self-evaluation confidence quantification method for dLLMs. DiSE quantifies confidence by computing the probability of regenerating the tokens of the entire generated sequence, given the full context. By leveraging these token regeneration probabilities, DiSE enables more efficient and reliable quality assessment, supporting both likelihood estimation and robust uncertainty quantification. Building upon DiSE, we further introduce a flexible-length generation framework that adaptively controls the sequence length based on the model's self-assessment of its own output. Experiments demonstrate that DiSE consistently improves performance across multiple datasets, improving likelihood evaluation by $4.0$\% and uncertainty evaluation by $6.4$\%, while achieving up to a $32\times$ speedup over a Monte Carlo simulation baseline, and additionally improving flexible-length generation accuracy. These results establish DiSE as an efficient and versatile self-evaluation framework for diffusion-based language models.
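For intuition, a minimal sketch of the kind of regeneration-probability scoring the abstract describes is given below. This is not the paper's exact procedure: the model interface (per-position vocabulary logits over a fully observed prompt-plus-generation sequence), the function and variable names, and the mean log-probability aggregation are all assumptions made for illustration.

```python
import torch

def dise_style_confidence(model, prompt_ids, gen_ids):
    """Hypothetical sketch: score how likely the dLLM is to regenerate
    its own output tokens given the full context in a single forward pass.

    Assumes `model(full_ids)` returns logits of shape [1, L, vocab_size];
    the aggregation (mean token log-probability) is also an assumption.
    """
    full_ids = torch.cat([prompt_ids, gen_ids], dim=-1).unsqueeze(0)  # [1, L]
    with torch.no_grad():
        logits = model(full_ids)                      # assumed [1, L, vocab]
    log_probs = torch.log_softmax(logits, dim=-1)
    # Positions of the generated tokens within the full sequence.
    gen_positions = torch.arange(prompt_ids.size(-1), full_ids.size(-1))
    # Log-probability of regenerating each generated token at its own position.
    token_logp = log_probs[0, gen_positions, gen_ids]
    # Collapse to a single confidence score (higher = more confident).
    return token_logp.mean().item()
```

Under these assumptions, the score requires only one forward pass over the completed sequence, which is where a large speedup over repeated Monte Carlo sampling would come from.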
Primary Area: foundation or frontier models, including LLMs
Submission Number: 2115