Zero-Round Active Learning

Sep 29, 2021 (edited Oct 06, 2021) · ICLR 2022 Conference Withdrawn Submission
  • Abstract: Active learning (AL) aims to reduce labeling effort by identifying the most valuable unlabeled data points in a large pool. Traditional AL frameworks have two limitations: first, they perform data selection in a multi-round manner, which is time-consuming and often impractical; second, they usually assume that a small set of labeled data points is available \emph{in the same domain as} the data in the unlabeled pool. This paper investigates a new active learning setting: how to conduct active learning without relying on pre-labeled data, which is under-explored yet of great practical value. We propose $D^2ULO$, a solution that addresses both issues by leveraging domain adaptation (DA) to train a data utility model that can effectively predict the utility of any unlabeled data point in the target domain once it is labeled. The trained data utility model can then be used to select high-utility data and, at the same time, provide an estimate of the utility of the selected data. Our algorithm does not rely on any feedback from annotators in the target domain; hence, it can not only work standalone but also benefit existing multi-round active learning algorithms by providing a warm start. Our experiments show that $D^2ULO$ outperforms state-of-the-art AL strategies equipped with domain adaptation over various domain-shift settings (e.g., real-to-real and synthetic-to-real data). In particular, $D^2ULO$ is applicable to scenarios where source and target labels are mismatched, which is not supported by existing works.
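The zero-round selection step described in the abstract can be illustrated with a minimal sketch: given a trained data utility model, score every point in the unlabeled pool and pick the top-scoring points as a warm start, with no annotator feedback. Everything here is hypothetical: `utility_model` stands in for the DA-trained utility model (its training is not shown), and the toy utility function is an arbitrary placeholder, not the paper's method.

```python
import numpy as np

def select_warm_start(utility_model, unlabeled_pool, budget):
    """Score each unlabeled point with a pre-trained data utility model
    and return the indices of the top-`budget` points (zero-round selection).

    `utility_model`: any callable mapping a data point to a scalar utility.
    """
    scores = np.array([utility_model(x) for x in unlabeled_pool])
    # Sort descending so the highest predicted utility comes first.
    selected = np.argsort(scores)[::-1][:budget]
    return selected, scores[selected]

# Toy pool with a stand-in utility proxy (norm of the point); purely illustrative.
pool = [np.array([0.1]), np.array([2.0]), np.array([1.0])]
idx, util = select_warm_start(lambda x: float(np.linalg.norm(x)), pool, budget=2)
print(idx.tolist())  # → [1, 2]
```

The returned utility estimates could seed a conventional multi-round AL loop, which is the warm-start use case the abstract mentions.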