Score-Based Diffusion Modeling for Nonparametric Empirical Bayes in Heteroscedastic Gaussian Mixtures
Keywords: empirical Bayes, Gaussian location mixture model, g-modeling, heteroscedastic noise, diffusion processes, score matching
TL;DR: We propose a score-based diffusion framework with explicit G-modeling, a nonparametric empirical Bayes method that achieves near-parametric score estimation guarantees and state-of-the-art denoising for multivariate heteroscedastic Gaussian mixtures.
Abstract: We propose a generalized score-based diffusion framework for learning multivariate Gaussian mixture models with homoscedastic or heteroscedastic noise. Our goal is to nonparametrically estimate the latent location distribution and denoise the observations.
Departing from the conventional maximum likelihood approach, we reinterpret each observation as a temporal slice of a family of stochastic diffusion processes. This modeling choice enables a principled characterization of the additive noise structure and supports a multi-step denoising procedure grounded in reverse-time dynamics. We introduce a score-based objective that explicitly models the latent distribution and accommodates observation-specific noise covariances.
Theoretically, we establish that the score estimation error from $n$ independent observations attains a near-parametric rate of $\frac{\mathrm{polylog}(n)}{n}$, improving upon existing results in the diffusion literature. Empirically, our method outperforms the nonparametric maximum likelihood estimator in both density estimation and denoising fidelity, especially in high-dimensional settings.
These findings suggest a promising direction for integrating nonparametric empirical Bayes with diffusion-based generative modeling for latent structure recovery.
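For intuition about why the score is the central quantity in the abstract above, the following is a minimal sketch of the classical link between the marginal score and the empirical Bayes denoiser (Tweedie's formula) in a heteroscedastic Gaussian location mixture. It is not the paper's diffusion-based estimator: it assumes a discrete prior whose atoms and weights are already known, and every name in it (`atoms`, `weights`, `cov`, and the function names) is hypothetical and chosen only for illustration.

```python
import numpy as np

def _log_terms(x, atoms, weights, cov):
    # log( w_k * N(x; atom_k, cov) ) for each atom k, with cov the
    # observation-specific noise covariance (assumed known).
    d = x.shape[0]
    prec = np.linalg.inv(cov)
    diff = atoms - x                                   # shape (K, d)
    quad = np.einsum("kd,de,ke->k", diff, prec, diff)  # Mahalanobis terms
    _, logdet = np.linalg.slogdet(cov)
    return np.log(weights) - 0.5 * quad - 0.5 * (d * np.log(2 * np.pi) + logdet)

def log_marginal(x, atoms, weights, cov):
    # log f(x), where f(x) = sum_k w_k N(x; atom_k, cov), via log-sum-exp.
    lt = _log_terms(x, atoms, weights, cov)
    m = lt.max()
    return m + np.log(np.exp(lt - m).sum())

def posterior_weights(x, atoms, weights, cov):
    # P(theta = atom_k | x) under the discrete prior.
    lt = _log_terms(x, atoms, weights, cov)
    return np.exp(lt - log_marginal(x, atoms, weights, cov))

def marginal_score(x, atoms, weights, cov):
    # grad_x log f(x) = cov^{-1} (E[theta | x] - x).
    post_mean = posterior_weights(x, atoms, weights, cov) @ atoms
    return np.linalg.solve(cov, post_mean - x)

def denoise(x, atoms, weights, cov):
    # Tweedie's formula: E[theta | x] = x + cov @ grad_x log f(x).
    return x + cov @ marginal_score(x, atoms, weights, cov)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    atoms = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])  # illustrative prior support
    weights = np.array([0.5, 0.3, 0.2])                      # illustrative prior weights
    theta = atoms[rng.choice(len(atoms), p=weights)]
    cov = np.array([[1.0, 0.3], [0.3, 0.5]])                 # one observation's noise covariance
    x = rng.multivariate_normal(theta, cov)
    print("observation:", x)
    print("denoised   :", denoise(x, atoms, weights, cov))
```

In this sketch each observation may carry its own `cov`, which is the heteroscedastic setting; an accurate score estimate then translates directly into an accurate posterior-mean denoiser through the identity in `denoise`.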
Primary Area: Probabilistic methods (e.g., variational inference, causal inference, Gaussian processes)
Submission Number: 26547