Keywords: Diffusion LLMs, Reinforcement Learning, Efficiency, Adaptive Compute
TL;DR: We introduce a learned remasking scheduler for discrete diffusion LLMs that improves performance and inference efficiency by enabling dynamic compute and reasoning allocation.
Abstract: Discrete diffusion language models (DLLMs) have emerged as a new language-modeling paradigm that offers improved inference efficiency and nonlinear generation and reasoning. Whereas standard methods rely on fixed or heuristic remasking schedules (e.g., random or confidence-based), we present LeADS, a framework that enables dynamic inference-time control of DLLMs via a learned remasking scheduler optimized for downstream performance. LeADS chooses which tokens are denoised at each diffusion step based on the model's internal representations and dynamically allocates compute for token efficiency. On mathematical reasoning tasks, LeADS achieves a 19.2% relative improvement (12 pp) over low-confidence denoising schedules and reduces the required number of diffusion steps by up to 15.3%.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 75