Learning From Noisy Labels via Dynamic Loss Thresholding

Published: 01 Jan 2024, Last Modified: 30 Jan 2025 · IEEE Trans. Knowl. Data Eng. 2024 · CC BY-SA 4.0
Abstract: Numerous studies have shown that deep neural networks (DNNs) can fit almost any data, even data with noisy labels, resulting in poor generalization performance. However, recent studies suggest that DNNs tend to memorize data gradually, fitting correctly labeled data before mislabeled data. Inspired by this finding, we propose a novel method named Dynamic Loss Thresholding (DLT). During training, DLT records the loss value of each sample and computes a dynamic loss threshold. DLT then compares each sample's loss with the current threshold: samples with smaller losses are more likely to be clean, and vice versa. DLT discards the labels of potentially corrupted samples and further leverages self-training semi-supervised learning techniques. Experiments on CIFAR-10/100, WebVision, and Clothing1M demonstrate substantial improvements over recent state-of-the-art methods. In addition, we investigate two real-world problems. First, we propose a novel approach for estimating the noise rate of a dataset based on the loss difference between the early and late training stages of DNNs. Second, we explore the effect of hard samples (which are difficult to distinguish) on learning from noisy labels.
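The following minimal sketch illustrates the general idea described in the abstract: record per-sample losses, derive a threshold that changes as training progresses, and split samples into a (likely) clean set, which keeps its labels, and a (likely) noisy set, whose labels are discarded for a semi-supervised stage. The function names and the quantile-annealing threshold rule are illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch of a dynamic loss thresholding step (assumed schedule,
# not the paper's exact rule).
import numpy as np

def dynamic_loss_threshold(losses: np.ndarray, epoch: int, total_epochs: int,
                           start_q: float = 0.9, end_q: float = 0.5) -> float:
    """Return a loss threshold that tightens as training progresses,
    reflecting that DNNs fit clean samples before mislabeled ones."""
    # Linearly anneal the quantile from start_q to end_q (assumed schedule).
    q = start_q + (end_q - start_q) * min(epoch / max(total_epochs - 1, 1), 1.0)
    return float(np.quantile(losses, q))

def split_clean_noisy(losses: np.ndarray, threshold: float):
    """Samples with loss at or below the threshold are treated as likely clean;
    the rest are treated as likely mislabeled and have their labels discarded."""
    clean_idx = np.where(losses <= threshold)[0]
    noisy_idx = np.where(losses > threshold)[0]
    return clean_idx, noisy_idx

# Example: synthetic per-sample losses recorded at some epoch.
rng = np.random.default_rng(0)
losses = np.concatenate([rng.gamma(1.0, 0.3, 900),   # mostly small (clean-like)
                         rng.gamma(4.0, 0.8, 100)])  # a few large (noisy-like)
thr = dynamic_loss_threshold(losses, epoch=20, total_epochs=100)
clean_idx, noisy_idx = split_clean_noisy(losses, thr)
# clean_idx keeps its labels; noisy_idx would be passed, unlabeled, to a
# semi-supervised learning stage (e.g., pseudo-labeling), as the abstract describes.
```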