Published: 09 Nov 2021, Last Modified: 05 May 2023, NeurIPS 2021 Poster
Keywords: Active learning, Deep Neural Networks, Neural Tangent Kernel, Selective Sampling, theoretical guarantees
TL;DR: We investigate online active learning in non-parametric regimes under the NTK approximation, and derive theoretical guarantees for the proposed algorithms.
Abstract: We investigate the problem of active learning in the streaming setting in non-parametric regimes, where the labels are stochastically generated from a class of functions on which we make no assumptions whatsoever. We rely on recently proposed Neural Tangent Kernel (NTK) approximation tools to construct a suitable neural embedding that determines both the feature space the algorithm operates on and the learned model computed atop it. Since the shape of the label-requesting threshold is tightly related to the complexity of the function to be learned, which is a priori unknown, we also derive a version of the algorithm that requires no such prior knowledge. This algorithm relies on a regret-balancing scheme to solve the resulting online model selection problem, and is computationally efficient. We prove joint guarantees on the cumulative regret and the number of requested labels, both of which depend on the complexity of the labeling function at hand. In the linear case, these guarantees recover known minimax results for the generalization error as a function of the label complexity in a standard statistical learning setting.
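To make the streaming setting concrete, the following is a minimal sketch of margin-based selective sampling, written under strong simplifying assumptions. It is not the algorithm from the paper: it uses the raw input features in place of an NTK embedding, a regularized least-squares predictor, and a hypothetical threshold of the form `width_scale * sqrt(x^T A^{-1} x)`. The names `selective_sampling`, `make_stream`, and `width_scale` are illustrative only. The learner predicts on every point but requests the label only when its margin falls inside the threshold.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def selective_sampling(xs, ys, width_scale=1.0):
    """Illustrative selective sampler (an assumption, not the paper's method).

    Keeps A^{-1} for A = I + sum of x x^T over the queried points, and
    b = sum of y x over the queried points. Labels are in {-1, +1}.
    Returns (number of label requests, number of prediction mistakes).
    """
    d = len(xs[0])
    a_inv = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    queries = 0
    mistakes = 0
    for x, y in zip(xs, ys):
        a_inv_x = mat_vec(a_inv, x)
        quad = max(0.0, dot(x, a_inv_x))
        # Margin of the current regularized least-squares estimate A^{-1} b.
        margin = dot(mat_vec(a_inv, b), x)
        # Hypothetical label-requesting threshold: scaled confidence width.
        width = width_scale * math.sqrt(quad)
        prediction = 1 if margin >= 0 else -1
        if prediction != y:
            mistakes += 1
        if abs(margin) <= width:
            # Request the label and update A^{-1} via Sherman-Morrison.
            queries += 1
            denom = 1.0 + quad
            for i in range(d):
                for j in range(d):
                    a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
            b = [bi + y * xi for bi, xi in zip(b, x)]
    return queries, mistakes


def make_stream(n, d, seed=0):
    """Synthetic unit-norm points labeled by a random linear separator."""
    rng = random.Random(seed)
    w_star = [rng.gauss(0, 1) for _ in range(d)]
    xs, ys = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(d)]
        norm = math.sqrt(dot(x, x))
        x = [xi / norm for xi in x]
        xs.append(x)
        ys.append(1 if dot(w_star, x) >= 0 else -1)
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_stream(500, 5, seed=1)
    queries, mistakes = selective_sampling(xs, ys, width_scale=1.0)
    print("label requests:", queries, "mistakes:", mistakes)
```

In this sketch, `width_scale` plays the role of the quantity that would have to be tuned to the unknown complexity of the labeling function. A larger value requests more labels, and a very large value degenerates to fully supervised online learning.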
