Plan for Speed: Dilated Scheduling for Masked Diffusion Language Models

ICLR 2026 Conference Submission 115 Authors

01 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Discrete Diffusion, Information Theory
TL;DR: Dilated Unmasking Scheduler (DUS): an inference-only, model-agnostic planner that parallelizes unmasking in diffusion LMs to improve the speed-quality tradeoff.
Abstract: Masked diffusion language models (MDLMs) promise fast, non-autoregressive text generation, yet existing samplers, which pick tokens to unmask based on model confidence, ignore interactions when unmasking multiple positions in parallel and effectively reduce to slow, autoregressive behavior. We propose the Dilated Unmasking Scheduler (DUS), an inference-only, planner-model-free method that partitions sequence positions into non-adjacent dilated groups and unmasks them in parallel so as to minimize an upper bound on joint entropy gain at each denoising step. By explicitly trading off the number of network calls against generation quality, DUS recovers most of the performance lost under traditional parallel unmasking strategies. Across math (GSM8K, MATH500), code (HumanEval, MBPP), and general-knowledge benchmarks (BBH, MMLU-Pro), DUS outperforms confidence-based planners, without modifying the underlying denoiser, and reveals the true speed-quality frontier of MDLMs.
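To make the scheduling idea concrete, below is a minimal sketch (not the authors' released implementation) of dilated-group partitioning and a parallel unmasking loop. The helper names (`dilated_groups`, `dus_generate`), the `denoiser` callable, and parameters such as `mask_id` and `num_groups` are illustrative assumptions; the sketch also commits tokens greedily rather than via the paper's entropy-bound criterion.

```python
import torch


def dilated_groups(seq_len: int, num_groups: int):
    """Partition positions 0..seq_len-1 into dilated (non-adjacent) groups.

    Group g holds positions g, g+num_groups, g+2*num_groups, ..., so the
    positions unmasked together in one step are spaced num_groups apart.
    """
    return [list(range(g, seq_len, num_groups)) for g in range(num_groups)]


@torch.no_grad()
def dus_generate(denoiser, seq_len, num_groups, mask_id, device="cpu"):
    """Hypothetical dilated unmasking loop: one denoiser call per group.

    `denoiser` is assumed to map a (1, seq_len) token tensor to
    (1, seq_len, vocab_size) logits; all names here are illustrative.
    """
    tokens = torch.full((1, seq_len), mask_id, dtype=torch.long, device=device)
    for group in dilated_groups(seq_len, num_groups):
        logits = denoiser(tokens)            # (1, seq_len, vocab_size)
        idx = torch.tensor(group, device=device)
        # Commit every position in the current dilated group in parallel
        # (greedy stand-in for the entropy-bound selection in the paper).
        tokens[0, idx] = logits[0, idx].argmax(dim=-1)
    return tokens
```

With `num_groups` much smaller than `seq_len`, the number of denoiser calls drops from one per token to one per group, which is the speed-quality trade-off the abstract describes.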
Supplementary Material: zip
Primary Area: generative models
Submission Number: 115