Keywords: Robust Mean Estimation, Unbounded First Moment, Symmetric Distributions (Spherical, Elliptical, Product), Filtering Algorithm, Huber Loss
TL;DR: We give polynomial-time algorithms, in some cases with nearly optimal error guarantees, for robustly estimating the location parameter of symmetric distributions.
Abstract: We study the problem of robustly estimating the mean or location parameter without moment assumptions.
Known computationally efficient algorithms rely on strong distributional assumptions, such as sub-Gaussianity or (certifiably) bounded moments.
Moreover, the guarantees that they achieve in the heavy-tailed setting are weaker than those for sub-Gaussian distributions with known covariance.
In this work, we show that such a tradeoff between error guarantees and heavy tails is not necessary for symmetric distributions.
We show that for a large class of symmetric distributions, the same error as in the Gaussian setting can be achieved efficiently.
The distributions we study include products of arbitrary symmetric one-dimensional distributions, such as product Cauchy distributions, as well as elliptical distributions,
a vast generalization of the Gaussian distribution.
For product distributions and elliptical distributions with known scatter (covariance) matrix, we show that, given an $\varepsilon$-corrupted sample, we can with probability at least $1-\delta$ estimate the location parameter up to error $O(\varepsilon \sqrt{\log(1/\varepsilon)})$ using $\tfrac{d\log(d) + \log(1/\delta)}{\varepsilon^2 \log(1/\varepsilon)}$ samples.
This result matches, up to the $\log(d)$ factor, the best-known guarantees for the Gaussian distribution as well as known statistical query (SQ) lower bounds.
For elliptical distributions with unknown scatter (covariance) matrix, we propose a sequence of efficient algorithms that approaches this optimal error.
Specifically, for every $k \in \mathbb{N}$, we design an estimator that uses $\tilde{O}(d^k)$ time and samples and achieves error $O(\varepsilon^{1-\frac{1}{2k}})$.
This matches the error and running-time guarantees obtained when assuming certifiably bounded moments of order up to $k$.
For unknown covariance, error bounds of $o(\sqrt{\varepsilon})$ are not known even for (general) sub-Gaussian distributions.
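For concreteness, instantiating this tradeoff at the first few values of $k$ (direct substitution into the bounds above):
$$k=1:\ \tilde{O}(d) \text{ time and samples, error } O(\sqrt{\varepsilon}); \qquad k=2:\ \tilde{O}(d^2) \text{ time and samples, error } O(\varepsilon^{3/4}),$$
with the error approaching $O(\varepsilon)$ as $k$ grows.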
Our algorithms are based on a generalization of the well-known filtering technique [DK22].
More specifically, we show how this machinery can be combined with Huber-loss-based
techniques to work with projections of the noise that behave more regularly than the original noise.
Moreover, we show how sum-of-squares proofs can be used to obtain algorithmic guarantees even for distributions without a first moment.
We believe that this approach may find other applications in future works.
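To make the two named ingredients concrete, below is a minimal illustrative Python sketch of (i) one generic filtering iteration in the style of [DK22] and (ii) a one-dimensional Huber-loss location estimate computed by iteratively reweighted least squares (IRLS). This is a toy rendition of the standard primitives on well-behaved (Gaussian) data, not the paper's algorithm; all function names, thresholds, and parameter choices here are our own illustrative assumptions.

```python
import numpy as np

def huber_location_1d(y, c=1.0, iters=50):
    """1-D location estimate minimizing the Huber loss
    rho_c(t) = t^2/2 for |t| <= c and c*|t| - c^2/2 otherwise,
    computed via iteratively reweighted least squares (IRLS)."""
    mu = np.median(y)
    for _ in range(iters):
        r = y - mu
        # IRLS weights: 1 in the quadratic region, c/|r| in the linear region.
        w = np.where(np.abs(r) <= c, 1.0, c / np.maximum(np.abs(r), 1e-12))
        mu = np.sum(w * y) / np.sum(w)
    return mu

def filter_step(X, slack=10.0):
    """One generic filtering iteration: project onto the top eigenvector of
    the empirical covariance and drop points with outlying projections."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    v = eigvecs[:, -1]                       # direction of largest variance
    scores = ((X - mu) @ v) ** 2
    keep = scores <= slack * np.median(scores)
    return X[keep], v

# Toy usage: Gaussian inliers at location mu_true, 5% of points shifted.
rng = np.random.default_rng(0)
mu_true = np.ones(20)
X = mu_true + rng.standard_normal((5000, 20))
X[:250] += 8.0                               # eps = 0.05 corruption
for _ in range(5):
    X, _ = filter_step(X)
est = np.array([huber_location_1d(X[:, j]) for j in range(X.shape[1])])
print(np.linalg.norm(est - mu_true))
```

In the heavy-tailed setting treated in the paper, covariance-based scores as above are not directly available; the point of the paper's machinery is to replace them with better-behaved projections of the noise, which the sketch does not attempt.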
Supplementary Material: pdf
Submission Number: 13072