Keywords: Label Noise, Noisy Label Detection, Deep Learning, Training Dynamics, Data Filtering
TL;DR: We propose a novel, lightweight method to detect and remove noisy labels using local information in a sample's neighborhood across epochs.
Abstract: Noisy labels are a pervasive challenge in modern supervised learning, especially in high-stakes domains such as healthcare, where model reliability is critical. Detecting mislabeled data and mitigating its influence is essential to improving both performance and interpretability. Building on insights from training dynamics, we propose $\textbf{Lo}\text{cal}\ \textbf{C}\text{onsistency}\ \textbf{a}\text{cross}\ \textbf{T}\text{raining}\ \textbf{E}\text{pochs}$ (LoCaTE), a family of data-filtering methods that leverage over-parameterized neural networks to distinguish clean samples from mislabeled ones. Our approach combines local neighborhood information with per-epoch behavior to identify noisy labels and improve robustness. Evaluated on CIFAR-10/100 under four canonical noise regimes, as well as on Clothing-1M, LoCaTE achieves competitive $F_{1}$ scores and improves downstream accuracy by up to seven percentage points. We additionally conduct an ablation that studies the performance of LoCaTE when it is restricted to a single epoch. These results highlight LoCaTE as a practical, low-overhead tool for reliable training on noisy datasets.
Submission Number: 176
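
The abstract does not spell out the scoring rule, so the following is only an illustrative sketch of one way "local consistency across training epochs" could be computed, not the authors' LoCaTE implementation. It assumes that feature vectors for every sample are saved at each epoch, that local consistency means the fraction of a sample's k nearest neighbors sharing its given label, and that per-epoch scores are simply averaged and thresholded. The function names, `k`, the threshold, and the toy data are all hypothetical.

```python
from math import dist

def knn_label_agreement(features, labels, k):
    """Fraction of each sample's k nearest neighbours that share its given label."""
    scores = []
    for i, x in enumerate(features):
        # Rank every other sample by Euclidean distance to sample i.
        others = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: dist(x, features[j]),
        )
        agree = sum(labels[j] == labels[i] for j in others[:k])
        scores.append(agree / k)
    return scores

def consistency_across_epochs(features_per_epoch, labels, k=3):
    """Average the per-epoch agreement scores (assumed aggregation rule)."""
    totals = [0.0] * len(labels)
    for feats in features_per_epoch:
        for i, s in enumerate(knn_label_agreement(feats, labels, k)):
            totals[i] += s
    return [t / len(features_per_epoch) for t in totals]

def flag_noisy(scores, threshold=0.5):
    """Indices whose averaged consistency falls below the (assumed) threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]

# Toy data: two well-separated clusters over two "epochs".
# Sample 3 lies in the first cluster but carries the second cluster's label.
epoch_1 = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
epoch_2 = [(x + 0.05, y - 0.05) for x, y in epoch_1]
labels = [0, 0, 0, 1, 1, 1, 1, 1]

scores = consistency_across_epochs([epoch_1, epoch_2], labels, k=3)
flagged = flag_noisy(scores)
print(flagged)
```

On this toy input the only sample whose neighbors consistently disagree with its label is the deliberately mislabeled one. In practice the per-epoch features would come from a network checkpoint, and a brute-force neighbor search like this would be replaced by an approximate nearest-neighbor index.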