Keywords: active learning, binary classification, competitive ratio
TL;DR: We propose an algorithm for agnostic active binary classification with good competitive ratio.
Abstract: For some hypothesis classes and input distributions, \emph{active}
agnostic learning needs exponentially fewer samples than passive
learning; for other classes and distributions, it offers little to
no improvement. The most popular algorithms for agnostic active
learning express their performance in terms of a parameter called
the disagreement coefficient, but it is known that these algorithms
are inefficient on some inputs.
We take a different approach to agnostic active learning, getting an
algorithm that is \emph{competitive} with the optimal algorithm for
any binary hypothesis class $H$ and distribution $\mathcal{D}_X$ over $X$.
In particular, if any algorithm can use $m^*$ queries to get
$O(\eta)$ error, then our algorithm uses $O(m^* \log |H|)$ queries to
get $O(\eta)$ error. Our algorithm lies in the vein of the
splitting-based approach of Dasgupta [2004], which gets a similar
result for the realizable ($\eta = 0$) setting.
We also show that it is NP-hard to do better than our algorithm's
$O(\log |H|)$ overhead in general.
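To make the splitting idea concrete, the sketch below is an illustrative toy (not the paper's algorithm, and only for the realizable $\eta = 0$ case in the spirit of Dasgupta [2004]): hypotheses are stored extensionally as label vectors over a finite pool, and each query greedily picks the point whose labels most evenly split the current version space. All names (`greedy_splitting`, `oracle`) are hypothetical.

```python
# Illustrative sketch, assuming a finite pool and a finite hypothesis class
# given extensionally as label tuples. Greedy version-space splitting in the
# spirit of Dasgupta [2004]; realizable case only, not the paper's algorithm.

def greedy_splitting(hypotheses, n_points, oracle):
    """Repeatedly query the point whose label split is most balanced."""
    version_space = list(hypotheses)
    queried = {}
    while len(version_space) > 1:
        best, best_gap = None, None
        for x in range(n_points):
            if x in queried:
                continue
            pos = sum(h[x] for h in version_space)
            if pos in (0, len(version_space)):
                continue  # all surviving hypotheses agree: uninformative
            gap = abs(2 * pos - len(version_space))  # 0 means perfect split
            if best_gap is None or gap < best_gap:
                best, best_gap = x, gap
        if best is None:
            break  # no informative queries remain
        y = oracle(best)               # ask for the true label of point `best`
        queried[best] = y
        version_space = [h for h in version_space if h[best] == y]
    return version_space, len(queried)

# Usage: threshold classifiers on 8 points; the target threshold is t = 5.
n = 8
H = [tuple(1 if i >= t else 0 for i in range(n)) for t in range(n + 1)]
target = H[5]
survivors, num_queries = greedy_splitting(H, n, lambda x: target[x])
```

Because each balanced query removes roughly half the version space, the number of label queries here scales like $\log |H|$ rather than $|H|$, which is the intuition behind splitting-based bounds.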
Submission Number: 9162