Rethinking Open-set Noise in Learning with Noisy Labels

14 May 2024 (modified: 06 Nov 2024) · Submitted to NeurIPS 2024 · CC BY 4.0
Keywords: Open-set noise, Noisy labels
TL;DR: We theoretically analyze and empirically validate the impact of open-set noise.
Abstract: To reduce reliance on labeled data, learning with noisy labels (LNL) has gained increasing attention. However, prevailing works typically assume that such datasets are primarily affected by closed-set noise (where the true/clean labels of noisy samples come from another known category) and therefore ignore the ubiquitous presence of open-set noise (where the true/clean labels of noisy samples may not belong to any known category). In this paper, we formally refine the LNL problem setting to account for the presence of open-set noise. We theoretically analyze and compare the effects of open-set noise and closed-set noise, as well as the differences between distinct open-set noise modes. We also analyze common open-set noise detection mechanisms based on prediction entropy. To empirically validate the theoretical results, we construct two open-set noisy datasets, CIFAR100-O and ImageNet-O, and introduce a novel open-set test set for the widely used WebVision benchmark. Our work suggests that open-set noise exhibits qualitatively and quantitatively distinct characteristics, and that fair and comprehensive model evaluation under this condition requires further exploration.
Primary Area: Machine vision
Submission Number: 11074
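
The abstract refers to open-set noise detection mechanisms based on prediction entropy. Below is a minimal sketch of that generic mechanism (not necessarily the paper's exact procedure): samples on which the model is uncertain across all known classes receive high prediction entropy and are flagged as open-set noise candidates. The function names and the threshold value are illustrative assumptions.

import torch
import torch.nn.functional as F

def prediction_entropy(logits: torch.Tensor) -> torch.Tensor:
    # Shannon entropy of the softmax prediction, computed per sample.
    probs = F.softmax(logits, dim=-1)
    log_probs = F.log_softmax(logits, dim=-1)
    return -(probs * log_probs).sum(dim=-1)

def flag_open_set_candidates(logits: torch.Tensor, threshold: float = 1.0) -> torch.Tensor:
    # High entropy means the prediction mass is spread over many known
    # classes, a common heuristic signal that the sample's true class
    # may lie outside the known label set. The threshold is a tunable
    # hyperparameter, not a value taken from the paper.
    return prediction_entropy(logits) > threshold

For example, with logits of shape (batch_size, num_classes) from any classifier, flag_open_set_candidates(logits) returns a boolean mask over the batch; in practice the threshold is chosen per dataset, e.g. from a validation split.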