Minimax Lower Bounds for Estimating Distributions on Low-dimensional Spaces

TMLR Paper3652 Authors

08 Nov 2024 (modified: 16 Nov 2024) · Under review for TMLR · CC BY 4.0
Abstract: Recent statistical analyses of Generative Adversarial Networks (GANs) suggest that the error in estimating the target distribution in terms of the $\beta$-H\"{o}lder Integral Probability Metric (IPM) scales as $\mathcal{O}\left(n^{-\frac{\beta}{\overline{d}_{\mathbb{M}}+\delta}} \vee n^{-1/2} \log n \right)$. Here $\overline{d}_{\mathbb{M}}$ is the upper Minkowski dimension of the support $\mathbb{M}$ of the data distribution and $\delta$ is a positive constant. It is, however, unknown whether this rate is minimax optimal, i.e., whether any estimator can achieve a better test-error rate. In this paper, we show that the minimax rate for estimating unknown distributions on $\mathbb{M}$ in the $\beta$-H\"{o}lder IPM scales as $\Omega\left(n^{-\frac{\beta}{\underline{d}_{\mathbb{M}}-\delta}} \vee n^{-1/2}\right)$, where $\underline{d}_{\mathbb{M}}$ is the lower Minkowski dimension of $\mathbb{M}$. Thus, if the low-dimensional structure $\mathbb{M}$ is regular in the Minkowski sense, i.e., $\overline{d}_{\mathbb{M}} = \underline{d}_{\mathbb{M}}$, GANs are roughly minimax optimal in estimating distributions on $\mathbb{M}$. We also show that the minimax estimation rate in the $p$-Wasserstein metric scales as $\Omega\left(n^{-\frac{1}{\underline{d}_{\mathbb{M}}-\delta}} \vee n^{-1/(2p)}\right)$.
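The abstract's rates are stated in terms of the upper and lower Minkowski dimensions of the support $\mathbb{M}$. For readers unfamiliar with these quantities, the standard covering-number definitions (not taken from this paper, but the usual ones for a bounded set $\mathbb{M} \subseteq \mathbb{R}^{D}$, with $N(\mathbb{M}, \epsilon)$ denoting the minimal number of $\epsilon$-balls needed to cover $\mathbb{M}$) are:

```latex
% Upper and lower Minkowski (box-counting) dimensions of a bounded set M,
% defined via the growth rate of the covering number N(M, eps) as eps -> 0.
\[
  \overline{d}_{\mathbb{M}}
    = \limsup_{\epsilon \to 0} \frac{\log N(\mathbb{M}, \epsilon)}{\log(1/\epsilon)},
  \qquad
  \underline{d}_{\mathbb{M}}
    = \liminf_{\epsilon \to 0} \frac{\log N(\mathbb{M}, \epsilon)}{\log(1/\epsilon)}.
\]
```

When the two limits coincide ($\overline{d}_{\mathbb{M}} = \underline{d}_{\mathbb{M}}$), the set is Minkowski regular, which is exactly the condition under which the abstract's upper bound $\mathcal{O}\big(n^{-\beta/(\overline{d}_{\mathbb{M}}+\delta)}\big)$ and lower bound $\Omega\big(n^{-\beta/(\underline{d}_{\mathbb{M}}-\delta)}\big)$ match up to the arbitrarily small slack $\delta$.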
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yunwen_Lei1
Submission Number: 3652