Can We Estimate The Entropy Of Arbitrary Distributions Known Up To A Normalization Constant?

Published: 22 Sept 2025, Last Modified: 01 Dec 2025 | NeurIPS 2025 Workshop | CC BY 4.0
Keywords: Stein Variational Gradient Descent, Sampling, Variational Inference, Entropy
TL;DR: We propose a variational method to estimate the entropy of distributions known up to a normalization constant.
Abstract: Computing the differential entropy of distributions known up to a normalization constant is a challenging problem with significant theoretical and practical applications. Variational inference is widely used for scalable approximation of densities from samples, but is under-explored when *only unnormalized densities are available*. This setup is more challenging, as it requires variational distributions that (1) leverage the unnormalized density, (2) are expressive enough to capture complex target distributions, (3) are computationally efficient, and (4) facilitate easy sampling. To address this, Messaoud et al. [2024] introduced **P-SVGD**, a particle-based variational method using Stein Variational Gradient Descent. However, we show that **P-SVGD** scales poorly to high-dimensional spaces. We propose **MET-SVGD**, an extension of **P-SVGD** that scales efficiently and comes with convergence guarantees. **MET-SVGD** incorporates (1) a sufficient condition for SVGD invertibility, (2) optimized parameterizations of SVGD updates, (3) a Metropolis-Hastings acceptance step for asymptotic convergence guarantees and enhanced expressivity, and (4) a correction term for better scalability. Our method bridges the gap between Metropolis-Hastings, particle-based sampling, and parametrized variational inference, achieving state-of-the-art results on scaling SVGD. We significantly outperform **P-SVGD** on entropy estimation, Maximum Entropy Reinforcement Learning, and image generation with Energy-Based Models. We will also release an open-source **MET-SVGD** library (https://tinyurl.com/2esyfx8j).
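For context, the sketch below shows the standard SVGD particle update that the abstract builds on: an RBF kernel with the median-heuristic bandwidth, driving particles toward an unnormalized target using only its score. This is a minimal, illustrative sketch of textbook SVGD, not the paper's **MET-SVGD**; the target density, step size, and particle count are assumptions, and the paper's added components (invertibility condition, Metropolis-Hastings acceptance step, scalability correction) are not included.

```python
# Minimal SVGD sketch (NumPy). Illustrative only -- not the paper's MET-SVGD.
import numpy as np

def grad_log_p(x):
    """Score of an unnormalized standard 2-D Gaussian target (illustrative choice)."""
    return -x  # grad of log exp(-||x||^2 / 2)

def rbf_kernel(x):
    """RBF kernel matrix and its gradients; bandwidth from the median heuristic."""
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    h = np.median(sq_dists) / np.log(len(x) + 1.0) + 1e-8
    K = np.exp(-sq_dists / h)
    # grad_K[j, i] = grad_{x_j} k(x_j, x_i)
    grad_K = (2.0 / h) * K[:, :, None] * (x[None, :, :] - x[:, None, :])
    return K, grad_K

def svgd_step(x, step_size=0.1):
    """One update: x_i += eps/n * sum_j [k(x_j, x_i) score(x_j) + grad_{x_j} k(x_j, x_i)]."""
    K, grad_K = rbf_kernel(x)
    phi = (K @ grad_log_p(x) + grad_K.sum(axis=0)) / len(x)
    return x + step_size * phi

particles = np.random.default_rng(0).normal(size=(200, 2)) * 3.0
for _ in range(500):
    particles = svgd_step(particles)
print("mean:", particles.mean(axis=0), "variance:", particles.var(axis=0))
```

After convergence, the particle mean and per-dimension variance should approach 0 and 1 for this toy target.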
Submission Number: 111