A review on label cleaning techniques for learning with noisy labels

Jongmin Shin, Jonghyeon Won, Hyun-Suk Lee, Jang-Won Lee

Published: 2024, Last Modified: 27 Sept 2025. Venue: ICT Express 2024. License: CC BY-SA 4.0

Abstract: Classification models assign objects to given classes, guided by training samples that consist of input features and labels. In practice, however, labels can be corrupted by human error or other mistakes, a problem known as label noise, which degrades classification accuracy. To address this issue, various recent works propose algorithms for cleaning datasets with label noise. We categorize these algorithms at a fine granularity and, based on this categorization, review them, including sample selection, label correction, and select-and-correct algorithms. In addition, we outline future research directions for dataset cleaning that account for practical challenges such as class imbalance, class-incremental learning, and corrupted input features.

External IDs: dblp:journals/ict-express/ShinWLL24