Abstract: Classical semi-supervised approaches achieve state-of-the-art results on various visual recognition tasks, especially image classification, but they are typically designed with expert knowledge of the task at hand, such as task-specific data augmentation. Consequently, these approaches do not generalize to novel tasks such as image segmentation and surface normal estimation. In this work, we instead study self-training for a wide variety of tasks in a task-agnostic fashion. We identify a simple recipe for success: construct a continuous schedule of learning updates that alternates between self-training on novel segments of a stream of unlabeled data and fine-tuning on a small, fixed labeled set. Our task-agnostic self-training approach works with only a few labeled samples per task by leveraging millions of unlabeled web images, and it requires neither the enormous computational resources nor the domain-specific unlabeled data assumed in most prior work. We show that our simple approach, without hyper-parameter tuning, can be as effective as FixMatch, a state-of-the-art semi-supervised learning method designed with task-specific knowledge for image classification. Furthermore, we demonstrate our findings on both (1) pixel-level tasks such as surface normal estimation and segmentation, and (2) diverse domains that differ drastically from web images, including medical, satellite, and agricultural imagery, where large amounts of labeled or unlabeled data are unavailable. The experiments consistently suggest that our approach is a competitive baseline to consider before developing compute-heavy and task-specific semi-supervised methods.
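The schedule described in the abstract can be sketched as a simple training loop. This is a minimal, hypothetical illustration, not the authors' implementation: the `Model` class, `pseudo_label`, and `train_step` are stand-ins that only record which phase ran, so the alternation between self-training on fresh unlabeled segments and fine-tuning on the fixed labeled set is visible.

```python
# Hypothetical sketch of the abstract's recipe: alternate between
# self-training on novel segments of an unlabeled stream and
# fine-tuning on a small, fixed labeled set. All names here
# (Model, pseudo_label, train_step) are illustrative assumptions.

from typing import List


class Model:
    """Toy stand-in for a task model; records how it was updated."""

    def __init__(self):
        self.updates: List[str] = []

    def pseudo_label(self, batch):
        # Assumption: the current model's predictions serve as targets.
        return [(x, f"pseudo({x})") for x in batch]

    def train_step(self, pairs, phase):
        # A real implementation would run gradient updates here.
        self.updates.append(phase)


def self_training_schedule(model, unlabeled_stream, labeled_set, num_rounds):
    """One round = self-train on a novel unlabeled segment,
    then fine-tune on the small, fixed labeled set."""
    for _ in range(num_rounds):
        segment = next(unlabeled_stream)          # novel data each round
        pseudo = model.pseudo_label(segment)      # generate pseudo-labels
        model.train_step(pseudo, phase="self-train")
        model.train_step(labeled_set, phase="fine-tune")
    return model
```

The key property of the schedule is that each round consumes a previously unseen segment of the unlabeled stream, while the labeled set reused for fine-tuning never changes.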
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Yanwei_Fu2
Submission Number: 1355