Denoising Lévy Probabilistic Models

Published: 22 Jan 2025, Last Modified: 03 Mar 2025 · ICLR 2025 Poster · CC BY 4.0
Keywords: diffusion, generative model, deep learning, machine learning, heavy-tail
TL;DR: The paper introduces the Denoising Lévy Probabilistic Model (DLPM), which replaces the Gaussian noise in denoising diffusion probabilistic models (DDPM) with heavy-tailed α-stable noise, improving performance and yielding better coverage of the tails of the data distribution.
Abstract: Investigating noise distributions beyond Gaussian in diffusion generative models remains an open challenge. The Gaussian case has been a major success experimentally and theoretically, admitting a unified stochastic differential equation (SDE) framework that encompasses score-based and denoising formulations. Recent studies have investigated the potential of \emph{heavy-tailed} noise distributions to mitigate mode collapse and to handle datasets exhibiting class imbalance, heavy tails, or prominent outliers. Very recently, Yoon et al.\ (NeurIPS 2023) presented the L\'{e}vy-It\^{o} model (LIM), directly extending the SDE-based framework to a class of heavy-tailed SDEs in which the injected noise follows an $\alpha$-stable distribution -- a rich class of heavy-tailed distributions. Despite its theoretical elegance and performance improvements, LIM relies on highly involved mathematical techniques, which may limit its accessibility and hinder its broader adoption and further development. In this study, we take a step back and, instead of starting from the SDE formulation, extend the denoising diffusion probabilistic model (DDPM) by directly replacing the Gaussian noise with $\alpha$-stable noise. Using only elementary proof techniques, we show that the proposed approach, the \emph{denoising L\'{e}vy probabilistic model} (DLPM), algorithmically boils down to running vanilla DDPM with minor modifications, hence allowing the use of existing implementations with minimal changes. Remarkably, as opposed to the Gaussian case, DLPM and LIM yield different training algorithms and different backward processes, leading to distinct sampling algorithms. This fundamental difference translates favorably into the performance of DLPM in several respects: our experiments show that DLPM achieves better coverage of the tails of the data distribution, better generation on unbalanced datasets, and improved computation times, requiring a significantly smaller number of backward steps.
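
The abstract's central algorithmic claim is that DLPM amounts to vanilla DDPM with the Gaussian draw swapped for heavy-tailed $\alpha$-stable noise. The sketch below is illustrative only and is not the authors' reference implementation: it samples isotropic α-stable noise via the standard sub-Gaussian (Gaussian scale mixture) construction using SciPy's `levy_stable`, and plugs it into a DDPM-style forward noising step. The noise schedule and the $(1-\bar\alpha_t)^{1/\alpha}$ exponent are assumptions made for illustration; the exact DLPM parameterization is the one given in the paper.

```python
# Minimal illustrative sketch (not the authors' code): isotropic alpha-stable
# noise via the sub-Gaussian construction, plugged into a DDPM-like forward step.
import numpy as np
from scipy.stats import levy_stable


def isotropic_alpha_stable(alpha: float, shape: tuple) -> np.ndarray:
    """Sample isotropic (rotationally invariant) alpha-stable noise, alpha in (0, 2).

    Sub-Gaussian construction: if A is a positive (beta=1) alpha/2-stable variable
    with scale cos(pi*alpha/4)**(2/alpha) and Z ~ N(0, 2I), then sqrt(A) * Z is
    isotropic alpha-stable. Assumes a 2-D (batch, dim) shape.
    """
    n = shape[0]
    sub_scale = np.cos(np.pi * alpha / 4.0) ** (2.0 / alpha)
    a = levy_stable.rvs(alpha / 2.0, 1.0, scale=sub_scale, size=n)  # one subordinator per sample
    z = np.sqrt(2.0) * np.random.randn(*shape)                      # N(0, 2I)
    return np.sqrt(a)[:, None] * z


def forward_noising(x0: np.ndarray, t: int, alphas_bar: np.ndarray, alpha: float) -> np.ndarray:
    """Hypothetical DDPM-like forward step with heavy-tailed noise.

    x_t = sqrt(abar_t) * x_0 + (1 - abar_t)**(1/alpha) * eps, where eps is
    isotropic alpha-stable; the exponent 1/alpha is an illustrative assumption
    (alpha = 2 recovers the usual Gaussian square-root scaling).
    """
    eps = isotropic_alpha_stable(alpha, x0.shape)
    abar_t = alphas_bar[t]
    return np.sqrt(abar_t) * x0 + (1.0 - abar_t) ** (1.0 / alpha) * eps


if __name__ == "__main__":
    T = 1000
    betas = np.linspace(1e-4, 2e-2, T)       # illustrative linear schedule
    alphas_bar = np.cumprod(1.0 - betas)
    x0 = np.random.randn(16, 2)              # toy 2-D "data"
    xt = forward_noising(x0, t=500, alphas_bar=alphas_bar, alpha=1.8)
    print(xt.shape)
```

As in the abstract, only the noise-sampling line differs from a standard DDPM forward step, which is why existing DDPM implementations can be reused with minimal changes.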
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2369