The Cosine Schedule is Fisher-Rao-Optimal for Masked Discrete Diffusion Models

NeurIPS 2025 Workshop NeurReps Submission 56 Authors

28 Aug 2025 (modified: 29 Oct 2025). Submitted to NeurReps 2025. License: CC BY 4.0
Keywords: Discrete Diffusion Models, Information Geometry, Sampling
TL;DR: The Cosine Schedule is Fisher-Rao-Optimal for Masked Discrete Diffusion Models
Abstract: We study how to choose the discretisation schedule for sampling from masked discrete diffusion models, viewed through the information geometry of the induced probability path. Specifically, we show that the optimal schedule under the Fisher-Rao geometry recovers the widely used cosine schedule.
Submission Number: 56
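
As a rough illustration of the claim in the abstract, the sketch below checks numerically that a cosine schedule takes steps of equal Fisher-Rao length. It is not taken from the submission and rests on assumptions of ours: each token is kept unmasked with probability `alpha` independently, so the per-token Fisher information in `alpha` is 1 / (alpha * (1 - alpha)) and the arc length from 0 to `alpha` is 2 * arcsin(sqrt(alpha)). The function names `cosine_schedule` and `fisher_rao_step_lengths` are hypothetical.

```python
import math


def cosine_schedule(num_steps):
    """Unmasked probabilities alpha_k = cos^2(pi * k / (2 * num_steps)).

    Assumed convention: alpha = 1 is clean data, alpha = 0 is fully masked.
    """
    return [math.cos(0.5 * math.pi * k / num_steps) ** 2
            for k in range(num_steps + 1)]


def fisher_rao_arc(alpha):
    """Arc length from 0 to alpha under ds^2 = d(alpha)^2 / (alpha * (1 - alpha))."""
    return 2.0 * math.asin(math.sqrt(alpha))


def fisher_rao_step_lengths(alphas):
    """Fisher-Rao length of each consecutive step of a schedule."""
    return [abs(fisher_rao_arc(a) - fisher_rao_arc(b))
            for a, b in zip(alphas, alphas[1:])]


if __name__ == "__main__":
    alphas = cosine_schedule(8)
    for alpha, step in zip(alphas[1:], fisher_rao_step_lengths(alphas)):
        # Every step should have the same length, pi / 8, under this metric.
        print(f"alpha={alpha:.4f}  step_length={step:.6f}")
```

Under these assumptions every step has length pi / num_steps, which is the constant-speed property one would expect of a Fisher-Rao-optimal discretisation; a linear schedule in `alpha` would instead take longer steps near the endpoints.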