Get a Head Start: Targeted Labeling at Source with Limited Annotation Overhead for Semi-Supervised Learning
Abstract: Semi-supervised learning (SSL), which leverages limited labeled data together with a large amount of unlabeled data for model training, has been widely studied to reduce the need for expensive and time-consuming annotation. Recently proposed methods achieve promising yet unstable results, as they presume that the initial labeled samples are randomly selected. Effective prior labeling on the cluttered, unlabeled source dataset is therefore challenging but significant for stabilizing this fluctuating performance while keeping annotation overhead low. In this paper, we propose a novel criterion and a distribution balance strategy that automatically achieve targeted labeling without access to the test set or to any labels. Comprehensive experiments on commonly used datasets demonstrate the effectiveness of our method. Furthermore, targeted labeling is orthogonal to existing framework-centric SSL methods and can achieve state-of-the-art performance.