Track: Type A (Regular Papers)
Keywords: Diffusion, Natural Language Processing, Artificial Intelligence, Genomic sequence modeling
Abstract: While diffusion models have achieved state-of-the-art results in continuous domains like image generation, their application to inherently discrete data such as natural language and DNA presents unique challenges. Continuous-space adaptations often introduce artifacts and complexities, motivating a focused investigation into models that operate directly on discrete data. This survey provides a comprehensive overview of methods and advancements in discrete diffusion models. We review the foundational formulations, including Denoising Diffusion Probabilistic Models (DDPMs) and Score-Based Generative Models (SGMs), and their theoretical adaptations to discrete state spaces. We then chronologically survey advancements across two key modalities, Natural Language Processing and genomic sequences, examining critical research topics such as novel forward processes and the adaptation of pre-trained language models. By synthesizing these developments and outlining future research directions, this paper offers a structured overview of this rapidly evolving field.
Serve As Reviewer: ~Ruurd_Jan_Anthonius_Kuiper1
Submission Number: 63