Get a Head Start: Targeted Labeling at Source with Limited Annotation Overhead for Semi-Supervised Learning
Abstract: Semi-supervised learning (SSL), which leverages limited labeled data together with a large amount of unlabeled data for model training, has been widely studied to reduce the need for expensive and time-consuming annotation. Recently proposed methods achieve promising yet unstable results, as they presume that the initial labeled samples are randomly selected. Effective prior labeling on the cluttered, unlabeled source dataset is therefore challenging but significant for stabilizing this fluctuating performance while keeping annotation overhead low. In this paper, we propose a novel criterion and a distribution balance strategy that automatically achieve targeted labeling without access to the test set or to any labels. Comprehensive experiments on commonly used datasets demonstrate the effectiveness of our method. Furthermore, targeted labeling is orthogonal to existing framework-centric SSL methods and can achieve state-of-the-art performance.