Combining MixMatch and Active Learning for Better Accuracy with Fewer Labels

Shuang Song; David Berthelot; Afshin Rostamizadeh

25 Sept 2019 (modified: 12 Oct 2025) | ICLR 2020 Conference Blind Submission | Readers: Everyone

Keywords: active learning, semi-supervised learning

TL;DR: We combine MixMatch with active learning to obtain better accuracy with fewer labels, then analyze the cost of labeling data versus adding unlabeled data.

Abstract: We propose using active-learning-based techniques to further improve MixMatch, the state-of-the-art semi-supervised learning algorithm. We provide a thorough empirical evaluation of several active-learning and baseline methods, which demonstrates a significant improvement on the benchmark CIFAR-10, CIFAR-100, and SVHN datasets (as much as 1.5% in absolute accuracy). We also provide an empirical analysis of the cost trade-off between incrementally gathering more labeled versus unlabeled data. This analysis can be used to measure the relative value of labeled and unlabeled data at different points of the learning curve: although the incremental value of labeled data can be as much as 20x that of unlabeled data, it quickly diminishes to less than 3x once more than 2,000 labeled examples are observed.

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/combining-mixmatch-and-active-learning-for/code)
