Gaussian universality for approximately polynomial functions of high-dimensional data
Abstract: Gaussian universality results assert that the properties of many estimators remain unchanged when the input data are replaced by Gaussians. Such results have gained popularity in high-dimensional statistics and machine learning, as Gaussianity often substantially simplifies downstream analyses. Yet, when universality may cease to hold remains an open question. To address this, we establish nearly optimal upper and lower bounds for Gaussian universality approximation, measured in Kolmogorov distance, over the class of approximately polynomial functions of high-dimensional random vectors. The upper bounds adapt the invariance principle of Mossel, O'Donnell and Oleszkiewicz (2010) to high-dimensional vectors and to functions beyond multilinear forms. As applications, we obtain a delta method for high-dimensional data with non-Gaussian limits, a necessary and sufficient condition for asymptotic normality, and simple estimators that are asymptotically normal but for which the bootstrap fails to be consistent. We also extend recent results on the high-dimensional degeneracy of non-degenerate U-statistics, the phase transition of MMD in two-sample tests with imbalanced data, and confidence spheres for high-dimensional averages. Our lower bound is constructive and shows that, for polynomials of even degree , universality holds up to . As a corollary, the Gaussian polynomial approximation error of is not improvable for even-degree U-statistics and V-statistics. Our results also explain how universality results for U-statistics and V-statistics differ significantly in their dependence on the dimension.