Learning With Non-Uniform Label Noise: A Cluster-Dependent Weakly Supervised Approach

Published: 01 Jan 2024 · Last Modified: 27 Sept 2024 · ICASSP 2024 · CC BY-SA 4.0
Abstract: Learning with noisy labels is a challenging task in machine learning. Furthermore, in practice label noise can be highly non-uniform in feature space, e.g., with a higher error rate for more difficult samples. Some recent works consider instance-dependent label noise, but they require additional information such as cleanly labeled data and confidence scores, which are usually unavailable or costly to obtain. In this paper, we consider learning with non-uniform label noise without requiring any such additional information. Inspired by stratified sampling, we propose a cluster-dependent sample selection algorithm followed by a contrastive training mechanism based on the cluster-dependent label noise. Despite its simplicity, the proposed method distinguishes clean samples from corrupted ones more precisely and achieves state-of-the-art performance on most image classification benchmarks, especially when the number of training samples is small and the noise rate is high. The code is released at https://github.com/MattZ-99/ClusterCL.
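As a rough illustration of the cluster-dependent selection idea described above (a minimal sketch, not the released implementation linked in the abstract), the snippet below stratifies samples by feature clusters and applies a small-loss selection within each cluster rather than a single global threshold. The choice of KMeans as the clustering backend, the per-cluster keep ratio, and the loss-based clean/corrupt criterion are illustrative assumptions.

```python
# Minimal sketch of cluster-dependent (stratified) sample selection.
# Assumptions: KMeans clustering, a fixed per-cluster keep ratio, and
# per-sample training loss as the cleanliness criterion.
import numpy as np
from sklearn.cluster import KMeans


def cluster_dependent_selection(features, per_sample_loss, n_clusters=10, keep_ratio=0.5):
    """Return a boolean mask marking samples treated as 'clean'.

    features:        (N, D) array of penultimate-layer features.
    per_sample_loss: (N,) array of training losses (low loss ~ likely clean).
    keep_ratio:      fraction of samples kept inside each cluster.
    """
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(features)
    clean_mask = np.zeros(len(features), dtype=bool)

    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        if idx.size == 0:
            continue
        # Select the lowest-loss samples within this cluster only, so the
        # threshold adapts to the cluster's own difficulty / noise level.
        n_keep = max(1, int(keep_ratio * idx.size))
        keep = idx[np.argsort(per_sample_loss[idx])[:n_keep]]
        clean_mask[keep] = True

    return clean_mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(1000, 64))      # stand-in for learned features
    losses = rng.gamma(shape=2.0, scale=1.0, size=1000)
    mask = cluster_dependent_selection(feats, losses)
    print(f"Selected {mask.sum()} / {len(mask)} samples as clean")
```

In this sketch, the per-cluster selection plays the role of stratified sampling: clusters whose samples are harder (and hence noisier) are not starved by a global small-loss cutoff, which is the intuition the abstract attributes to the cluster-dependent approach.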