Uncertainty-Based Instance-Dependent Noisy Label Datasets Generation

Published: 2025, Last Modified: 27 Jan 2026, Venue: IEA/AIE (1) 2025, License: CC BY-SA 4.0
Abstract: In real-world applications, noisy labels degrade a model's generalization performance. Among the various types of label noise, instance-dependent noise (IDN) depends on the features of individual samples. Although noisy labels have been studied extensively, pre-defined IDN datasets remain scarce. To address this issue, we propose a method for generating IDN datasets based on the uncertainty of each data sample. Higher uncertainty implies lower confidence in the label and therefore a higher likelihood that the label is noisy. We supplement a previous uncertainty quantification method and assign noisy labels to the top \(r\%\) of data samples ranked by predictive uncertainty, which is estimated by ensembling the predictions of stochastic models generated through Monte Carlo dropout. We demonstrate the effectiveness of the proposed method by comparing the similarity between human-assigned noisy labels and the generated noisy labels.
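The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the network, the uncertainty measure, or how the replacement label is chosen, so the following are all assumptions made for illustration: a tiny linear softmax classifier with hypothetical weights `W`, dropout applied to the input features, predictive entropy of the averaged prediction as the uncertainty score, and a flipping rule that picks the most probable class other than the clean label. All function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mc_dropout_mean_probs(x, weights, passes, drop_p, rng):
    """Average the softmax outputs of `passes` stochastic forward passes."""
    mean = [0.0] * len(weights)
    keep = 1.0 - drop_p
    for _ in range(passes):
        # A fresh dropout mask per pass; inverted scaling keeps the expected input.
        masked = [xi / keep if rng.random() < keep else 0.0 for xi in x]
        logits = [sum(w * xi for w, xi in zip(row, masked)) for row in weights]
        for k, pk in enumerate(softmax(logits)):
            mean[k] += pk / passes
    return mean


def predictive_entropy(probs):
    """Entropy of the averaged prediction (assumed uncertainty score)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def generate_idn_labels(X, y, weights, r, passes=20, drop_p=0.5, seed=0):
    """Flip the labels of the top r% most uncertain samples."""
    rng = random.Random(seed)
    probs = [mc_dropout_mean_probs(x, weights, passes, drop_p, rng) for x in X]
    entropy = [predictive_entropy(p) for p in probs]
    n_noisy = math.ceil(len(X) * r / 100.0)
    ranked = sorted(range(len(X)), key=lambda i: entropy[i], reverse=True)
    noisy = list(y)
    for i in ranked[:n_noisy]:
        # Assumed flipping rule: most probable class other than the clean label.
        candidates = [k for k in range(len(weights)) if k != y[i]]
        noisy[i] = max(candidates, key=lambda k: probs[i][k])
    return noisy, entropy


# Toy data: two features, two classes; W stands in for a trained classifier.
X = [[2.0, 0.1], [0.1, 2.0], [1.0, 1.0], [1.1, 0.9], [3.0, 0.0]]
y = [0, 1, 0, 0, 0]
W = [[1.0, 0.0], [0.0, 1.0]]
noisy, entropy = generate_idn_labels(X, y, W, r=40)
print(noisy, [round(h, 3) for h in entropy])
```

With `r=40` and five samples, two labels are replaced. Which two depends on the dropout masks, so the seed is fixed for repeatability. In practice the stochastic passes would come from a trained network with its dropout layers left active at inference time.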