Keywords: Open-set noise, Noisy labels
TL;DR: We theoretically analyze the impact of open-set noise and empirically validate our findings.
Abstract: To reduce reliance on labeled data, learning with noisy labels (LNL) has gained increasing attention. However, prevailing works typically assume that such datasets are affected primarily by closed-set noise (where the true/clean label of a noisy sample belongs to another known category) and therefore ignore the ubiquitous presence of open-set noise (where the true/clean label of a noisy sample may not belong to any known category).
In this paper, we formally refine the LNL problem setting to account for the presence of open-set noise. We theoretically analyze and compare the effects of open-set and closed-set noise, as well as the effects of different open-set noise modes. We also analyze common open-set noise detection mechanisms based on prediction entropy. To empirically validate the theoretical results, we construct two open-set noisy datasets, CIFAR100-O and ImageNet-O, and introduce a novel open-set test set for the widely used WebVision benchmark. Our work suggests that open-set noise exhibits characteristics that are qualitatively and quantitatively distinct from those of closed-set noise, and that fairly and comprehensively evaluating models under such noise requires further exploration.
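For concreteness, below is a minimal sketch of the kind of prediction-entropy-based open-set noise detection the abstract refers to; the threshold value and the probability inputs are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def prediction_entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy of each row of class probabilities, shape (N, C)."""
    eps = 1e-12  # avoid log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)

def flag_open_set(probs: np.ndarray, threshold: float = 1.0) -> np.ndarray:
    """Flag samples whose prediction entropy exceeds a (hypothetical) threshold.

    High entropy means the model is uncertain across all known classes,
    which entropy-based detectors take as evidence that the sample's true
    class may lie outside the known label set (open-set noise).
    """
    return prediction_entropy(probs) > threshold

# Usage: softmax outputs over C = 3 known classes.
probs = np.array([[0.90, 0.05, 0.05],   # confident -> likely in-distribution
                  [0.34, 0.33, 0.33]])  # near-uniform -> likely open-set
print(flag_open_set(probs, threshold=1.0))  # [False  True]
```

The threshold here is a free parameter; in practice it would be tuned (e.g., on a validation set), and the paper's analysis concerns when such entropy-based criteria can or cannot separate open-set samples.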
Primary Area: Machine vision
Submission Number: 11074