Keywords: label noise, robustness, subspace learning, Sparse PCA
TL;DR: We improve label-noise robustness of low-dimensional subspace learning by applying sparse PCA.
Abstract: Recent work shows that deep neural networks (DNNs) first learn clean samples and then memorize noisy samples. Early stopping can therefore be used to improve performance when training with noisy labels. It was also shown recently that the training trajectory of DNNs can be approximated in a low-dimensional subspace using PCA. The DNNs can then be trained in this subspace achieving similar or better generalization. These two observations were utilized together, to further boost the generalization performance of vanilla early stopping on noisy label datasets. In this paper, we probe this finding further on different real-world and synthetic label noises. First, we show that the prior method is sensitive to the early stopping hyper-parameter. Second, we investigate the effectiveness of PCA, for approximating the optimization trajectory under noisy label information. We propose to estimate low-rank subspace through robust and structured variants of PCA, namely Robust PCA, and Sparse PCA. We find that the subspace estimated through these variants can be less sensitive to early stopping, and can outperform PCA to achieve better test error when trained on noisy labels.