Keywords: Active learning, Agnostic learning, PAC learning, Query complexity, Minimax analysis, VC dimension, Star number, Disagreement coefficient
TL;DR: We prove that for every concept class, the optimal query complexity of agnostic active learning is strictly smaller than the sample complexity of agnostic passive learning.
Abstract: We sharply characterize the optimal first-order query complexity of agnostic active learning for all concept classes, and propose a new general active learning algorithm that achieves it. Remarkably, the optimal query complexity admits a leading term that is always strictly smaller than the sample complexity of passive supervised learning, by a factor proportional to the best-in-class error rate. This was not previously known to be possible in the agnostic setting. By comparison, in all previous general analyses the leading term exhibits an additional factor, such as the disagreement coefficient or a related complexity measure, and therefore improves over passive learning only in restricted cases. The present work removes such factors from the leading term entirely, implying that $\textit{every}$ concept class benefits from active learning in the non-realizable case. These results resolve a long-standing open question central to the past two decades of research on the theory of agnostic active learning.
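For orientation, the comparison the abstract describes can be sketched schematically (this sketch is not taken from the submission; it assumes the standard notation $d$ for the VC dimension, $\nu$ for the best-in-class error rate, and $\varepsilon$ for the target excess error, and suppresses constants, confidence parameters, logarithmic factors, and lower-order terms; the precise form of the bound is given in the paper itself):

```latex
\[
\underbrace{m_{\mathrm{passive}}(\varepsilon)
    \;=\; \Theta\!\Big(\tfrac{\nu+\varepsilon}{\varepsilon^{2}}\, d\Big)}_{%
    \text{known passive agnostic minimax rate}}
\qquad\text{vs.}\qquad
n_{\mathrm{active}}(\varepsilon)
    \;\approx\; \nu \cdot m_{\mathrm{passive}}(\varepsilon)
    \;+\; \text{lower-order terms},
\]
```

where, by contrast, earlier general analyses of agnostic active learning carried an extra multiplicative factor in the leading term, such as the disagreement coefficient $\theta$, so they could beat $m_{\mathrm{passive}}$ only when that factor was small.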
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 25001