Mislabeled examples detection viewed as probing machine learning models: concepts, survey and extensive benchmark

TMLR Paper2960 Authors

04 Jul 2024 (modified: 17 Sept 2024) · Under review for TMLR · CC BY 4.0
Abstract: Mislabeled examples are ubiquitous in real-world machine learning datasets. We show that most mislabeled example detection methods can be viewed as probing trained machine learning models according to a few core principles. We formalize a modular framework that encompasses these methods, parameterized by only four building blocks, and provide a Python library demonstrating that these principles can actually be implemented. The focus is on classifier-agnostic concepts, with an emphasis on adapting methods developed for deep learning models to non-deep classifiers for tabular data. We benchmark existing methods on (artificial) Noisy Completely At Random (NCAR) as well as (actual) Noisy Not At Random (NNAR) labeling noise arising from a series of tasks with imperfect labeling rules. This benchmark offers new insights into, as well as limitations of, existing methods in this setup.
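To make the "probing a trained model" idea concrete, here is a minimal, hedged sketch that is not the paper's library or its actual building blocks: it ranks examples by an out-of-fold self-confidence score (the predicted probability of the observed label) and flags the least trusted ones as potentially mislabeled. The choice of classifier, the trust score, and the synthetic NCAR-style noise are all illustrative assumptions.

```python
# Illustrative sketch only (not the paper's library): probe a trained,
# non-deep classifier on tabular data and rank examples by how much the
# model trusts their observed labels.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_predict

# Synthetic tabular data with a few labels flipped uniformly at random
# (a crude stand-in for NCAR noise).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
rng = np.random.default_rng(0)
flipped = rng.choice(len(y), size=50, replace=False)
y_noisy = y.copy()
y_noisy[flipped] = 1 - y_noisy[flipped]

# Probe: out-of-fold predicted probabilities of the observed (possibly noisy)
# label; a low value means the model finds the label hard to trust.
probs = cross_val_predict(
    GradientBoostingClassifier(random_state=0), X, y_noisy,
    cv=5, method="predict_proba",
)
trust = probs[np.arange(len(y_noisy)), y_noisy]

# Flag the 50 least trusted examples and check how many were actually flipped.
suspects = np.argsort(trust)[:50]
precision = np.isin(suspects, flipped).mean()
print(f"Fraction of flagged examples that were truly flipped: {precision:.2f}")
```

Other detectors described in the survey differ mainly in which quantities are probed (losses, margins, gradients, training dynamics) and how they are aggregated into a trust score, which is what the four-building-block decomposition captures.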
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Aditya_Menon1
Submission Number: 2960