Keywords: robust learning, malicious noise, contamination, outlier removal
TL;DR: We provide nearly optimal guarantees for several fundamental problems in robust supervised learning via a single iterative polynomial filtering algorithm.
Abstract: Inspired by recent work on learning with distribution shift, we give a
general outlier removal algorithm called *iterative polynomial
filtering* and show a number of striking applications to supervised
learning with contamination:
(1) We show that any function class that can be approximated by
low-degree polynomials with respect to a hypercontractive distribution
can be efficiently learned under bounded contamination (also
known as *nasty noise*).  This surprisingly resolves a
longstanding gap between the complexity of agnostic learning and
learning with contamination, as low-degree approximators were widely
believed to imply tolerance only to label noise.
(2) For any function class that admits the (stronger) notion of
sandwiching approximators, we obtain near-optimal learning guarantees
even with respect to heavy additive contamination, where far more than
$1/2$ of the training set may be added adversarially. Prior
related work held only for regression and in a list-decodable setting.
(3) We obtain the first efficient algorithms for tolerant testable
learning of functions of halfspaces with respect to any fixed
log-concave distribution.  Even the non-tolerant case for a single
halfspace in this setting had remained open.
These results significantly advance our understanding of efficient
supervised learning under contamination, a setting that has been much
less studied than its unsupervised counterpart.
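For intuition, the sketch below illustrates the generic shape of a polynomial filtering loop: expand samples into low-degree monomial features, look for a polynomial direction whose empirical variance exceeds what a well-behaved (e.g., hypercontractive) distribution would allow, and trim the points most responsible. This is a minimal, hypothetical sketch; the feature map, the variance threshold `var_bound`, the trim fraction, and all function names are illustrative assumptions, not the paper's actual algorithm or its certificates.

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_features(X, degree=2):
    """Expand samples into all monomials of total degree <= `degree`."""
    n, d = X.shape
    feats = [np.ones(n)]  # constant term
    for k in range(1, degree + 1):
        for idx in combinations_with_replacement(range(d), k):
            feats.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(feats)

def iterative_filter(X, degree=2, var_bound=3.0, trim_frac=0.05, max_rounds=20):
    """Drop points that dominate a polynomial direction with excessive variance."""
    keep = np.arange(len(X))
    for _ in range(max_rounds):
        P = poly_features(X[keep].astype(float), degree)
        P -= P.mean(axis=0)                      # center the feature expansion
        eigvals, eigvecs = np.linalg.eigh(P.T @ P / len(keep))
        if eigvals[-1] <= var_bound:             # moments look inlier-like: stop
            break
        scores = (P @ eigvecs[:, -1]) ** 2       # contribution to the bad direction
        cutoff = np.quantile(scores, 1.0 - trim_frac)
        keep = keep[scores < cutoff]             # remove the highest scorers
    return keep

if __name__ == "__main__":
    # Toy demo: 950 Gaussian inliers plus 50 planted outliers far from the mean.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(size=(950, 3)),
                   rng.normal(8.0, 1.0, size=(50, 3))])
    inliers = iterative_filter(X)
    print(f"kept {len(inliers)} of {len(X)} points")
```

On contaminated data, the surviving indices `keep` would then be handed to a standard learner (e.g., low-degree polynomial regression); the paper's guarantees concern its actual algorithm and analysis, not this toy loop.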
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 23131