Abstract: In this work, we provide a systematic survey of Discrete Diffusion Language Models (dLLMs) and Discrete Diffusion Multimodal Language Models (dMLLMs). Unlike autoregressive (AR) models, dLLMs and dMLLMs adopt a multi-token, parallel decoding paradigm using full attention and a denoising-based generation strategy. This paradigm naturally enables parallel generation, fine-grained output control, and dynamic perception — capabilities that were previously difficult to achieve with AR models. A growing number of industrial-scale proprietary d(M)LLMs, as well as a large number of open-source academic d(M)LLMs, have demonstrated performance comparable to their autoregressive counterparts, while achieving up to 10$\times$ acceleration in inference speed. These developments position discrete diffusion models as a promising alternative to the traditional autoregressive paradigm. In this work, we present a comprehensive overview of research in the dLLM and dMLLM domains. We trace the historical development of dLLMs and dMLLMs, formalize the underlying mathematical frameworks, list commonly used modeling methods, and categorize representative models. We further analyze key techniques for training, inference, and quantization. We also discuss trustworthiness issues and summarize emerging applications across the language, vision-language, and biological domains, among others. We conclude by discussing future directions for research and deployment.
Submission Type: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=6yIY8VNWor
Changes Since Last Submission: Dear Associate Editor and Reviewers,
Thank you very much for organizing the review process and for your time and effort in handling our submission.
In our previous submission, the manuscript was desk-rejected due to remaining formatting differences, particularly the reduced font size of the references. To address this issue, we removed the `\footnotesize` command that caused the reference font to shrink. In addition, we removed all `\vspace` commands from figures and tables so that their spacing strictly follows the TMLR template.
We sincerely hope that the revised manuscript now fully complies with the formatting requirements. Thank you again for your consideration and support.
Kind regards,
The Authors
Assigned Action Editor: ~Shuangfei_Zhai3
Submission Number: 7141