Abstract: Within machine learning, active learning studies the gains in performance made possible by adaptively selecting data points to label. In this work, we show, through matching upper and lower bounds, that in a simple, benign setting of well-specified logistic regression on a uniform distribution over a sphere, the expected excess errors of both active learning and random sampling have the same inversely proportional dependence on the number of samples.
Importantly, because a lower bound for this benign setting also applies to any setting that contains it, no more general setting admits a better dependence on the number of samples. Additionally, we show that a variant of uncertainty sampling can converge faster than random sampling by a factor of the Bayes error, confirming a recent empirical observation made in other work.
Qualitatively, this work is pessimistic with respect
to the asymptotic dependence on the number of
samples, but optimistic with respect to finding
performance gains in the constants.
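
To make the comparison concrete, below is a minimal, hypothetical sketch of pool-based uncertainty sampling for well-specified logistic regression on the sphere, contrasted with random sampling. The pool size, dimension, signal strength `w_star`, and the excess-error proxy (disagreement with the Bayes classifier) are all illustrative assumptions, not the paper's construction.

```python
# Illustrative sketch (not the paper's exact algorithm): uncertainty
# sampling vs. random sampling for logistic regression on the sphere.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d, n_pool, n_seed, n_queries = 5, 2000, 20, 200  # assumed sizes

# Unlabeled pool: points drawn uniformly from the unit sphere in R^d.
X = rng.standard_normal((n_pool, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)

# Well-specified logistic model: P(y = 1 | x) = sigmoid(w_star . x).
w_star = 3.0 * np.ones(d)  # assumed ground-truth parameter
y = (rng.random(n_pool) < 1.0 / (1.0 + np.exp(-X @ w_star))).astype(int)

def run(strategy):
    labeled = list(rng.choice(n_pool, n_seed, replace=False))
    for _ in range(n_queries):
        clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
        rest = np.setdiff1d(np.arange(n_pool), labeled)
        if strategy == "uncertainty":
            # Query the pool point whose predicted probability is
            # closest to 1/2, i.e. the most uncertain point.
            p = clf.predict_proba(X[rest])[:, 1]
            labeled.append(rest[np.argmin(np.abs(p - 0.5))])
        else:
            # Random sampling: query a uniformly random unlabeled point.
            labeled.append(rng.choice(rest))
    # Excess-error proxy: disagreement with the Bayes classifier
    # sign(w_star . x) over the whole pool.
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    bayes = (X @ w_star > 0).astype(int)
    return np.mean(clf.predict(X) != bayes)

print("random     :", run("random"))
print("uncertainty:", run("uncertainty"))
```

Under this sketch, both strategies' errors shrink with the number of labels; any advantage of uncertainty sampling shows up as a constant-factor improvement, in line with the abstract's claim.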