Finding the Sweet Spot: Batch Selection for One-Class Active Learning

Published: 01 Jan 2020, Last Modified: 12 May 2023, SDM 2020
Abstract: Active learning methods collect annotations, typically class labels from human experts, to improve upon some classification task. In many cases, one can collect annotations for a batch of observations at a time, e.g., when several annotators are available. This can make the annotation process more efficient regarding both human and computational resources. However, selecting a good batch is difficult. It requires understanding several trade-offs between the costs of classifier training, batch selection, annotation effort, and classification accuracy. For one-class classification, a very important application of active learning, batch selection has not received any attention in the literature so far. In this article, we strive to find a sweet spot between the costs of batch-mode learning and classification accuracy. To this end, we first frame batch selection as an optimization problem. We then propose several strategies to identify good batches, discuss their properties, and evaluate them on real-world data. A core result is that a sweet spot indeed exists, with active learning costs reduced by up to an order of magnitude compared to a sequential procedure, without sacrificing accuracy.
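To make the setting concrete, here is a minimal sketch of batch-mode active learning for one-class classification. It is illustrative only and not the paper's proposed method: it assumes a One-Class SVM scores the unlabeled pool, and each round selects the batch of k points closest to the decision boundary (a simple uncertainty heuristic) for annotation. All names (`select_batch`, `X_pool`, `k`) are hypothetical.

```python
# Illustrative batch selection for one-class active learning
# (uncertainty heuristic; NOT the strategies proposed in the paper).
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(200, 2))  # unlabeled observation pool

# Fit a one-class classifier on the current (unlabeled) data.
clf = OneClassSVM(gamma="scale").fit(X_pool)


def select_batch(clf, X, k=10):
    """Pick the k points nearest the decision boundary, i.e. the
    most uncertain ones, as the next batch to annotate."""
    scores = np.abs(clf.decision_function(X))
    return np.argsort(scores)[:k]


batch = select_batch(clf, X_pool, k=10)
```

Selecting k points per round instead of one amortizes classifier retraining over k annotations, which is the cost/accuracy trade-off the abstract refers to.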