Keywords: poisson diffusion, information-theoretic framework, discrete modeling, negative log likelihood
TL;DR: This paper introduces ItDPDM, a Poisson-based diffusion model that directly models discrete data without relying on continuous embeddings or variational approximations, achieving improved likelihood estimation on music and image benchmarks.
Abstract: Generative modeling of non-negative, discrete data, such as symbolic music, remains challenging due to two persistent limitations in existing methods. Firstly, many approaches rely on modeling continuous embeddings, which is suboptimal for inherently discrete data distributions. Secondly, most models optimize variational bounds rather than exact data likelihood, resulting in inaccurate likelihood estimates and degraded sampling quality. While recent diffusion-based models have addressed these issues separately, we tackle them jointly. In this work, we introduce the Information-Theoretic Discrete Poisson Diffusion Model (ItDPDM), inspired by photon arrival process, which combines exact likelihood estimation with fully discrete-state modeling. Central to our approach is an information-theoretic Poisson Reconstruction Loss (PRL) that has a provable exact relationship with the true data likelihood. ItDPDM achieves improved likelihood and sampling performance over prior discrete and continuous diffusion models on a variety of synthetic discrete datasets. Furthermore, on real-world datasets such as symbolic music and images, ItDPDM attains superior likelihood estimates and competitive generation quality—demonstrating a proof of concept for distribution-robust discrete generative modeling.
Supplementary Material: zip
Primary Area: Other (please use sparingly, only use the keyword field for more details)
Submission Number: 25853
Loading