Distance Estimation for High-Dimensional Discrete Distributions

Published: 22 Jan 2025, Last Modified: 10 Mar 2025AISTATS 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0
Abstract: Given two distributions $\mathcal{P}$ and $\mathcal{Q}$ over a high-dimensional domain $\{0,1\}^n$, and a parameter $\varepsilon$, the goal of distance estimation is to determine the statistical distance between $\mathcal{P}$ and $\mathcal{Q}$, up to an additive tolerance $\pm \varepsilon$. Since exponential lower bounds (in $n$) are known for the problem in the standard sampling model, research has focused on richer query models where one can draw conditional samples. This paper presents the first polynomial query distance estimator in the conditional sampling model ($\mathsf{COND}$). We base our algorithm on the relatively weaker \textit{subcube conditional} sampling ($\mathsf{SUBCOND}$) oracle, which draws samples from the distribution conditioned on some of the dimensions. $\mathsf{SUBCOND}$ is a promising model for widespread practical use because it captures the natural behavior of discrete samplers. Our algorithm makes $\tilde{\mathcal{O}}(n^3/\varepsilon^5)$ queries to $\mathsf{SUBCOND}$.
Submission Number: 331
Loading