SPMDM: Enhancing Masked Diffusion Models through Simplifying Sampling Paths

Published: 18 Sept 2025, Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: discrete diffusion, generative modeling
Abstract: Autoregressive models (ARMs) show strong capabilities across many domains but struggle with planning and complex reasoning due to their strictly sequential generation. Masked diffusion models (MDMs) address these issues by enabling controllable, any-order, and parallel generation, but they are harder to train because the difficulty of predicting a token varies with the positions of the already-unmasked tokens. This work identifies two key characteristics of efficient MDM sampling paths: prioritizing tokens near unmasked ones, and generating subsequences that occur earlier in the reasoning process first. We propose the Simple Path Masked Diffusion Model (SPMDM), which partitions sequences into fixed-length, non-overlapping subsequences and applies varying noise scales across them to learn both token-level and cross-subsequence dependencies. Experiments on synthetic data and tasks such as Countdown and Sudoku show that SPMDM captures structural rules effectively, significantly outperforming existing MDMs and ARMs, with competitive results on broader reasoning benchmarks.
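
The blockwise masking described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name spmdm_mask, the sorted per-block noise schedule (earlier subsequences kept cleaner, mimicking a path that resolves earlier blocks first), and the mask_id placeholder are all hypothetical.

import torch

def spmdm_mask(tokens, block_len, mask_id, generator=None):
    # Hypothetical sketch: partition each sequence into fixed-length,
    # non-overlapping subsequences and mask each block with its own
    # noise scale, so earlier blocks are on average less noised.
    batch, seq_len = tokens.shape
    assert seq_len % block_len == 0, "sequence length must be divisible by block length"
    num_blocks = seq_len // block_len
    # One noise level per subsequence; sorting keeps earlier blocks cleaner.
    noise = torch.rand(batch, num_blocks, generator=generator).sort(dim=1).values
    # Broadcast each block's noise level to its block_len token positions.
    per_token = noise.repeat_interleave(block_len, dim=1)
    mask = torch.rand(batch, seq_len, generator=generator) < per_token
    noisy = tokens.masked_fill(mask, mask_id)
    return noisy, mask

# Usage: mask a batch of two length-12 sequences in blocks of 4.
tokens = torch.randint(1, 100, (2, 12))
noisy, mask = spmdm_mask(tokens, block_len=4, mask_id=0)

Under this sketch, a model trained to reconstruct the masked positions would see a spectrum of per-block corruption levels in every batch, exposing it to both within-block (token-level) and cross-block dependencies.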
Supplementary Material: zip
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 19272