Keywords: supervised learning, multi-label learning, label ambiguity, label noise
Abstract: Transfer learning from ImageNet-pretrained models has become essential for many computer vision tasks. Recent studies have shown that ImageNet includes label ambiguity, where images containing multiple object classes are assigned a single label. This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data. Recent approaches have explored either fixing the evaluation datasets or using costly procedures to relabel the training data. In this work, we propose multi-label iterated learning (MILe) to incorporate the inductive biases of multi-label learning from single labels using the framework of iterated learning. MILe is a simple yet effective procedure that alternates training a teacher and a student network with binary predictions to build a multi-label description of the images. Experiments on ImageNet show that MILe achieves higher accuracy and ReaL score than the standard training procedure, even when fine-tuning from self-supervised weights. We also show that MILe is effective for real-world large-scale noisy data such as WebVision. Furthermore, MILe improves performance in class-incremental settings such as IIRC and is robust to distribution shifts.
One-sentence Summary: We introduce a multi-label iterated learning method to alleviate the problems of label ambiguity and label noise in supervised classification.
Supplementary Material: zip