The Role of Label Noise in the Feature Learning Process

Andi Han; Wei Huang; Zhanpeng Zhou; Gang Niu; Wuyang Chen; Junchi Yan; Akiko Takeda; Taiji Suzuki

24 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0.

Keywords: Label noise, Feature Learning, Training Dynamics

TL;DR: We theoretically characterize the role of label noise in training neural networks from a feature learning perspective, identifying two stages in the training dynamics.

Abstract: Deep learning with noisy labels presents significant challenges. In this work, we theoretically characterize the role of label noise in training neural networks from a feature learning perspective. Specifically, we consider a *signal-noise* data distribution, where each data point comprises a label-dependent signal and label-independent noise, and rigorously analyze the training dynamics of a two-layer convolutional neural network under this data setting in the presence of label noise. In particular, we identify two stages in which the dynamics exhibit distinct patterns. In *Stage I*, the model perfectly fits all the clean samples (i.e., samples without label noise) while ignoring the noisy ones (i.e., samples with noisy labels). During this stage, the model learns the signal from the clean samples, which generalizes well to unseen data. In *Stage II*, as the training loss converges, the gradient in the direction of noise surpasses that of the signal, leading to overfitting on the noisy samples. Eventually, the model memorizes the noise present in the noisy samples, which degrades its generalization ability. In contrast, when training without label noise, the dynamics do not exhibit this two-stage pattern. Furthermore, our results provide theoretical support for two widely used techniques for tackling label noise: early stopping and sample selection. Experiments on both synthetic and real-world datasets confirm our theoretical findings.
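
To make the signal-noise setting concrete, the following is a minimal illustrative sketch of how such data with flipped labels might be generated. It is not taken from the submission: the two-patch layout, the fixed signal direction, the Gaussian noise orthogonal to the signal, and every name and default value (`make_signal_noise_data`, `signal_norm`, `noise_std`, `flip_prob`) are assumptions chosen only for illustration.

```python
import numpy as np

def make_signal_noise_data(n=200, d=50, signal_norm=2.0, noise_std=1.0,
                           flip_prob=0.1, seed=0):
    """Hypothetical generator for a signal-noise dataset with label noise.

    Each sample has two patches (an assumption): one carries the
    label-dependent signal y * mu, the other label-independent Gaussian noise.
    """
    rng = np.random.default_rng(seed)

    # Fixed signal direction mu along the first coordinate (assumed).
    mu = np.zeros(d)
    mu[0] = signal_norm

    # True labels are uniform over {-1, +1}.
    y_true = rng.choice([-1, 1], size=n)

    # Signal patch: label-dependent, equal to y * mu.
    signal_patch = y_true[:, None] * mu[None, :]

    # Noise patch: label-independent Gaussian noise, with the component
    # along mu zeroed out so it is orthogonal to the signal (assumed).
    noise_patch = noise_std * rng.standard_normal((n, d))
    noise_patch[:, 0] = 0.0

    # Stack the patches into an array of shape (n, 2, d).
    x = np.stack([signal_patch, noise_patch], axis=1)

    # Label noise: flip each label independently with probability flip_prob.
    flip = rng.random(n) < flip_prob
    y_obs = np.where(flip, -y_true, y_true)
    return x, y_true, y_obs, flip
```

In this sketch, the samples with `flip == False` play the role of the clean samples and those with `flip == True` the noisy ones; a model that fits `y_obs` on the flipped samples can only do so through the noise patch, since their signal patch points toward the true label.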

Supplementary Material: zip

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 3468
