Least Probable Disagreement Region for Active Learning

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: active learning, uncertainty sampling, disagreement region, variation ratio, deep learning, distance, decision boundary
Abstract: Querying unlabeled samples near the estimated decision boundary at each step is known to be an effective active learning strategy when the distance from a sample to the decision boundary can be explicitly evaluated. In many machine learning settings, however, especially those involving deep learning, a conventional distance such as the $\ell_p$ distance from a sample to the decision boundary is not readily measurable. This paper defines a theoretical distance from an unlabeled sample to the decision boundary as the least probable disagreement region (LPDR) containing that sample, and discusses how this theoretical distance can be evaluated empirically with a lower order of time complexity. Monte Carlo sampling of hypotheses is performed to approximate the theoretically defined distance. Experimental results on various datasets show that the proposed algorithm consistently outperforms other high-performing uncertainty-based active learning algorithms and achieves state-of-the-art active learning performance on the CIFAR10, CIFAR100, Tiny ImageNet and Food101 datasets. On CIFAR100 with K-CNN, only the proposed algorithm outperforms random sampling, while all other algorithms fail to do so.
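The core idea of the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes a toy linear hypothesis, Gaussian weight perturbations as the Monte Carlo sampling of hypotheses, and the heuristic that a sample whose predicted label flips frequently under small hypothesis perturbations lies in a highly probable disagreement region, i.e. close to the decision boundary:

```python
import numpy as np

def disagreement_rates(w, X, n_hypotheses=200, noise_scale=0.1, seed=0):
    """For each unlabeled point, estimate how often its predicted label
    flips under small random perturbations of the hypothesis w.
    Frequent flips indicate proximity to the decision boundary
    (an LPDR-style proxy, not the paper's exact estimator)."""
    rng = np.random.default_rng(seed)
    base = np.sign(X @ w)  # labels under the current (unperturbed) hypothesis
    flips = np.zeros(len(X))
    for _ in range(n_hypotheses):
        # Monte Carlo-sampled hypothesis: Gaussian noise on the weights
        w_pert = w + noise_scale * rng.standard_normal(w.shape)
        flips += np.sign(X @ w_pert) != base
    return flips / n_hypotheses

def query_indices(w, X, batch_size=1):
    """Query the unlabeled samples with the highest disagreement rate,
    i.e. those estimated to be nearest the decision boundary."""
    rates = disagreement_rates(w, X)
    return np.argsort(-rates)[:batch_size]

# Toy usage: boundary is x1 = 0; the second point sits almost on it.
w = np.array([1.0, 0.0])
X = np.array([[5.0, 1.0], [0.01, 1.0], [3.0, -2.0]])
print(query_indices(w, X, batch_size=1))  # the near-boundary point is queried
```

The ranking step is what distinguishes this family of methods from random sampling: labeling effort is spent only where sampled hypotheses disagree.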
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: This paper defines a sample distance to the decision boundary as the least probable disagreement region (LPDR) containing the sample, and proposes querying unlabeled samples near the decision boundary with lower time complexity in active learning.
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=Wg-hniER8I
