Abstract: Active learning (AL) aims to reduce annotation costs while maximizing model performance by iteratively selecting valuable instances.
While foundation models have made it easier to identify these instances, existing selection strategies still lack robustness across different models, annotation budgets, and datasets.
To quantify the performance gains that are still attainable and to establish a reference point for research, we explore oracle strategies, i.e., upper baseline strategies approximating the optimal selection by accessing ground truth information unavailable in practical AL scenarios. Current oracle strategies, however, fail to scale effectively to large datasets and complex deep neural networks. To tackle these limitations, we introduce the Best-of-Strategy Selector (BoSS), a scalable oracle strategy designed for large-scale AL scenarios.
BoSS constructs a set of candidate batches through an ensemble of selection strategies and then selects the batch yielding the highest performance gain. As an ensemble of selection strategies, BoSS can be easily extended with new state-of-the-art strategies as they emerge, ensuring it remains a reliable upper baseline in the future.
Our evaluation demonstrates that i) BoSS outperforms existing oracle strategies, ii) state-of-the-art AL strategies have significant room for improvement, especially in large-scale datasets with many classes, and iii) one possible solution to counteract the inconsistent performance of AL strategies is to employ an ensemble-based approach for the selection.
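The selection procedure described above can be sketched as a simple loop: each ensemble member proposes one candidate batch, and the batch whose addition yields the highest score under oracle evaluation is chosen. The function and strategy names below are illustrative assumptions, not the authors' implementation.

```python
def boss_select(strategies, unlabeled_pool, labeled_set, batch_size, evaluate):
    """Hypothetical BoSS sketch: return the candidate batch with the
    highest oracle-evaluated performance gain.

    `strategies` is a list of callables producing candidate batches;
    `evaluate` stands in for the oracle (train + score with ground-truth
    labels), which is unavailable in practical AL scenarios.
    """
    best_batch, best_score = None, float("-inf")
    for strategy in strategies:
        # Each ensemble member proposes one candidate batch.
        batch = frozenset(strategy(unlabeled_pool, labeled_set, batch_size))
        # Oracle access: score the model as if the batch were labeled.
        score = evaluate(labeled_set | batch)
        if score > best_score:
            best_batch, best_score = batch, score
    return best_batch
```

Because the oracle only compares the batches that the ensemble proposes, adding a new state-of-the-art strategy amounts to appending one more callable to `strategies`, which is what makes the upper baseline extensible.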
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Ozan_Sener1
Submission Number: 5488