Keywords: Learning Theory, Active Learning, ReLU Regression, Agnostic Learning
TL;DR: We design the first computationally efficient algorithm for general agnostic ReLU regression with near-optimal query complexity
Abstract: We study the task of agnostically learning general (as opposed to homogeneous) ReLUs under the Gaussian distribution with respect to the squared loss. In the passive learning setting, recent work gave a computationally efficient algorithm that uses $\mathrm{poly}(d,1/\epsilon)$ labeled examples and outputs a hypothesis with error $O(\mathrm{opt})+\epsilon$, where $\mathrm{opt}$ is the squared loss of the best-fit ReLU. Here we focus on the interactive setting, where the learner has some form of query access to the labels of unlabeled examples. Our main result is the first computationally efficient learner that uses $d\,\mathrm{polylog}(1/\epsilon)+\tilde{O}(\min\{1/p, 1/\epsilon\})$ black-box label queries, where $p$ is the bias of the target function, and achieves error $O(\mathrm{opt})+\epsilon$. We complement our algorithmic result by showing that its query complexity bound is qualitatively near-optimal, even ignoring computational constraints. Finally, we establish that query access is essentially necessary to improve on the label complexity of passive learning. Specifically, for pool-based active learning, any active learner requires $\tilde{\Omega}(d/\epsilon)$ labels unless it draws a super-polynomial number of unlabeled examples.
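To make the learning setup concrete, the following is a minimal sketch (illustrative only; it is not the paper's algorithm) of the agnostic ReLU regression problem: examples are drawn from a standard Gaussian, the hypothesis class consists of general (non-homogeneous) ReLUs $x \mapsto \max(0, \langle w, x\rangle + b)$, and performance is measured by the squared loss relative to $\mathrm{opt}$, the loss of the best-fit ReLU. All variable names here are assumptions for illustration.

```python
import numpy as np

def relu_hypothesis(w, b, X):
    """General (non-homogeneous) ReLU: x -> max(0, <w, x> + b)."""
    return np.maximum(0.0, X @ w + b)

def squared_loss(w, b, X, y):
    """Empirical estimate of the population squared loss E[(h(x) - y)^2]."""
    return np.mean((relu_hypothesis(w, b, X) - y) ** 2)

rng = np.random.default_rng(0)
d, n = 5, 20000
X = rng.standard_normal((n, d))            # unlabeled pool: x ~ N(0, I_d)

# "Agnostic" means labels need not come from any ReLU; here we simulate
# a ReLU target corrupted by additive noise, so opt is roughly the noise variance.
w_star, b_star = rng.standard_normal(d), 0.3
y = relu_hypothesis(w_star, b_star, X) + 0.1 * rng.standard_normal(n)

# The best-fit ReLU achieves loss near the noise variance (0.01 here);
# an (O(opt) + eps)-learner must compete with this benchmark.
loss = squared_loss(w_star, b_star, X, y)
```

In the interactive setting the learner sees `X` for free but must pay one black-box label query per revealed entry of `y`, which is the resource the paper's $d\,\mathrm{polylog}(1/\epsilon)+\tilde{O}(\min\{1/p, 1/\epsilon\})$ bound counts.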
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 24731