Distance Estimation for High-Dimensional Distributions

16 Sept 2023 (modified: 11 Feb 2024), Submitted to ICLR 2024
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Sampling, Distribution Testing, High-dimensional statistics
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: We study the distance estimation problem for high-dimensional distributions. Given two distributions $\mathcal{P}$ and $\mathcal{Q}$ over $\{0,1\}^n$, and a parameter $\varepsilon$, the goal of distance estimation is to determine the statistical distance between the two distributions up to an additive tolerance $\pm \varepsilon$. Since exponential lower bounds (in $n$) are known for the problem in the standard sampling model, research has focused on models where one can draw conditional samples. Among these models, \textit{subcube conditioning} ($\mathsf{SUBCOND}$), i.e., conditioning on arbitrary subcubes of the domain, holds the promise of widespread practical adoption owing to its ability to capture the natural behavior of distribution samplers. In this paper, we present the first polynomial sample distance estimator in the conditional sampling model, and our algorithm makes $\tilde{\mathcal{O}}(n^3/\varepsilon^5)$ $\mathsf{SUBCOND}$ queries. We implement our algorithm to estimate the distance between distributions arising from real-life sampling benchmarks, and we find that our algorithm easily scales beyond the naive method.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 651
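
To make the abstract's terminology concrete, the following is a minimal sketch of the two objects it mentions: the statistical (total variation) distance between two distributions over $\{0,1\}^n$, and a subcube-conditioning oracle. It is not the paper's estimator. The distance is computed by brute-force enumeration of all $2^n$ points, which is an assumed stand-in for a naive baseline, and the names `tv_distance` and `subcond_sample`, along with the dictionary representation of a distribution, are hypothetical choices made for illustration.

```python
import itertools
import random

def tv_distance(p, q, n):
    """Exact statistical distance 0.5 * sum_x |p(x) - q(x)| over {0,1}^n.

    Enumerates all 2^n points, so it is only feasible for small n.
    """
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0))
                     for x in itertools.product((0, 1), repeat=n))

def subcond_sample(p, fixed, rng):
    """Draw one sample from p conditioned on a subcube.

    `fixed` maps coordinate index -> bit; the subcube is the set of
    points x with x[i] == b for every (i, b) in `fixed`.
    """
    support = [(x, w) for x, w in p.items()
               if w > 0 and all(x[i] == b for i, b in fixed.items())]
    if not support:
        raise ValueError("conditioning subcube has zero probability under p")
    points, weights = zip(*support)
    return rng.choices(points, weights=weights, k=1)[0]

# Toy distributions over {0,1}^2 (illustrative values only).
P = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
Q = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.25, (1, 1): 0.25}

rng = random.Random(0)
exact = tv_distance(P, Q, 2)
sample = subcond_sample(Q, {0: 0}, rng)  # condition on first bit = 0
```

In this toy setup, conditioning `Q` on the first bit being 0 leaves a single point with nonzero mass, so the oracle's answer is forced. The enumeration in `tv_distance` grows as $2^n$, which is the scaling that motivates query-efficient estimators of the kind the abstract describes.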