LNL+K: Enhancing Learning with Noisy Labels Through Noise Source Knowledge Integration

22 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission
Keywords: robust learning, learning with noisy labels
TL;DR: We introduce a novel task, termed LNL+K (enhancing Learning with Noisy Labels through noise source Knowledge integration), which applies noise source knowledge in learning with noisy labels.
Abstract: Learning with noisy labels (LNL) aims to train a high-performing model using a noisy dataset. We observe that noise for a given class often comes from a limited set of categories, yet many LNL methods overlook this. For example, an image mislabeled as a cheetah is more likely a leopard than a hippopotamus due to visual similarity. In fact, we find that many datasets have meta-data that directly provides potential noise sources. Thus, in this paper, we explore a task we refer to as Learning with Noisy Labels with noise source Knowledge integration (LNL+K), which assumes we have some knowledge about the likely source(s) of label noise that we can take advantage of. We find that integrating noise source knowledge boosts performance, even supporting settings where LNL methods typically fail. For example, LNL+K methods are effective on datasets where noise represents the majority of samples, which breaks a critical premise of most methods developed for the LNL task. We also find that LNL+K methods can boost performance even when the noise sources are estimated rather than provided in the meta-data. Our experiments evaluate several baseline LNL+K methods that integrate noise source knowledge into state-of-the-art LNL models across five diverse datasets and three types of noise, where we report gains of up to 15% compared to the unadapted methods. Critically, we show that LNL methods fail to generalize on some real-world datasets, even when adapted to integrate noise source knowledge, highlighting the importance of directly exploring our LNL+K task.
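To make the LNL+K idea concrete, below is a minimal illustrative sketch of one way per-class noise source knowledge could be folded into an LNL-style clean-sample selection step. This is not the authors' method; the function name `select_clean`, the `noise_sources` mapping, and the margin-based selection rule are assumptions chosen for illustration only.

```python
# Hypothetical sketch: use known noise sources per class to decide which
# samples to treat as "likely clean" during training. Not the paper's algorithm.
import numpy as np

def select_clean(probs: np.ndarray,
                 noisy_labels: np.ndarray,
                 noise_sources: dict,
                 margin: float = 0.0) -> np.ndarray:
    """Return a boolean mask over samples treated as likely clean.

    probs        : (N, C) softmax predictions from the current model.
    noisy_labels : (N,) observed (possibly noisy) labels.
    noise_sources: map from a class index to the classes its noise tends to come from.
    margin       : optional confidence margin required over each noise source.
    """
    n = probs.shape[0]
    keep = np.zeros(n, dtype=bool)
    for i in range(n):
        y = int(noisy_labels[i])
        sources = noise_sources.get(y, [])
        if not sources:
            # No knowledge for this class: fall back to a plain confidence test.
            keep[i] = probs[i, y] >= probs[i].max()
        else:
            # Keep the sample only if the labeled class beats every known
            # noise source for that class by the required margin.
            keep[i] = all(probs[i, y] >= probs[i, s] + margin for s in sources)
    return keep

# Toy usage: class 0 ("cheetah") is assumed to receive noise from class 1 ("leopard").
probs = np.array([[0.6, 0.3, 0.1],   # confident in the labeled class -> kept
                  [0.3, 0.6, 0.1]])  # looks like the noise source -> dropped
labels = np.array([0, 0])
print(select_clean(probs, labels, noise_sources={0: [1]}))  # [ True False]
```

The design intuition matches the abstract's cheetah/leopard example: instead of asking only "is the model confident in the given label?", the selection compares the labeled class against its known confusable classes, which is what allows such a rule to remain usable even when noise dominates a class.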
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5761