Can We Estimate The Entropy Of Arbitrary Distributions Known Up To A Normalization Constant? A Tale of Stein Variational Gradient Descent Scalability
Keywords: Stein Variational Gradient Descent, Sampling, Variational Inference, Entropy
Abstract: Computing the differential entropy of distributions known only up to a normalization constant is a challenging problem with significant theoretical and practical applications. Variational inference is widely used for scalable approximation of densities from samples, but is under-explored when only unnormalized densities are available. This setup is more challenging because it requires variational distributions that (1) leverage the unnormalized density, (2) are expressive enough to capture complex target distributions, (3) are computationally efficient, and (4) facilitate easy sampling. To address this, Messaoud et al. [2024] introduced P-SVGD, a particle-based variational method using Stein Variational Gradient Descent. However, we show that P-SVGD scales poorly to high-dimensional spaces. We propose MET-SVGD, an extension of P-SVGD that scales efficiently and comes with convergence guarantees. MET-SVGD incorporates (1) a sufficient condition for SVGD invertibility, (2) optimized parameterizations of the SVGD updates, (3) a Metropolis-Hastings acceptance step for asymptotic convergence guarantees and enhanced expressivity, and (4) a correction term for better scalability. Our method bridges the gap between Metropolis-Hastings, particle-based sampling, and parametrized variational inference, achieving state-of-the-art results on scaling SVGD. We significantly outperform P-SVGD on entropy estimation, Maximum Entropy Reinforcement Learning, and Energy-Based Model image generation benchmarks. We will also release an open-source MET-SVGD library (https://tinyurl.com/2esyfx8j).
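For readers unfamiliar with the particle-based building block the abstract refers to, the sketch below shows a vanilla SVGD update (the transport step that P-SVGD and, per the abstract, MET-SVGD build upon). This is a minimal illustration, not the authors' MET-SVGD implementation: the RBF kernel, median-heuristic bandwidth, step size, and Gaussian target are all assumptions for the example.

```python
# Minimal sketch of a vanilla SVGD particle update (Liu & Wang, 2016-style),
# the transport step that P-SVGD / MET-SVGD extend. Kernel, bandwidth
# heuristic, step size, and target density are illustrative assumptions.
import numpy as np

def rbf_kernel(x, h):
    # Pairwise squared distances and RBF kernel matrix with bandwidth h.
    diffs = x[:, None, :] - x[None, :, :]              # diffs[j, i] = x_j - x_i
    sq_dists = np.sum(diffs ** 2, axis=-1)
    k = np.exp(-sq_dists / h)                           # k[j, i] = k(x_j, x_i)
    grad_k = -2.0 / h * diffs * k[:, :, None]           # grad_{x_j} k(x_j, x_i)
    return k, grad_k

def svgd_step(x, grad_log_p, step_size=1e-1):
    n = x.shape[0]
    # Median heuristic for the kernel bandwidth (a common, assumed choice).
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    h = np.median(sq_dists) / np.log(n + 1) + 1e-8
    k, grad_k = rbf_kernel(x, h)
    # phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    phi = (k @ grad_log_p(x) + grad_k.sum(axis=0)) / n
    return x + step_size * phi

# Usage example: transport particles toward a standard Gaussian,
# for which grad log p(x) = -x (the unnormalized density suffices).
particles = 3.0 * np.random.randn(200, 2)
for _ in range(500):
    particles = svgd_step(particles, lambda x: -x)
```

MET-SVGD, as described in the abstract, would additionally constrain this map to be invertible and wrap it in a Metropolis-Hastings acceptance step; those components are not reproduced here.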
Submission Number: 92