Abstract: Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive. To mitigate this cost, machine learning methods, such as transfer learning, semi-supervised learning, and active learning, aim to be \emph{label-efficient}: achieving high predictive performance from relatively few labeled examples. While obtaining the best label-efficiency in practice often requires combinations of these techniques, existing benchmark and evaluation frameworks do not capture their concerted use. This paper addresses this deficiency by introducing LabelBench, a new computationally-efficient framework for the joint evaluation of multiple label-efficient learning techniques. As an application of LabelBench, we introduce a novel benchmark of state-of-the-art active learning methods in combination with semi-supervised learning for fine-tuning pretrained vision transformers. Our benchmark demonstrates significantly better label-efficiency than previously reported in active learning. LabelBench's modular codebase is open-sourced for the broader community to contribute label-efficient learning methods and benchmarks. The repository can be found at: https://github.com/EfficientTraining/LabelBench.
Certifications: Reproducibility Certification
Keywords: Label-Efficient Learning, Active Learning, Large Pretrained Models
Video: https://www.youtube.com/watch?v=07MJ1xHJBUc
Code: https://github.com/EfficientTraining/LabelBench
Assigned Action Editor: ~Yue_Zhao13
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 22