Abstract: Data annotations collected for supervised learning often suffer from label noise, which inevitably yields unreliable deep neural networks. Existing solutions to this problem typically limit their scope to instance-independent label noise. However, due to the high ambiguity of data and the inexperience of annotators, instance-dependent noise is also widely observed, yet it remains largely uninvestigated. In this paper, we propose a novel \underline{IDE}ntify and \underline{AL}ign (IDEAL) methodology, which aims to eliminate the feature distribution shift induced by a broad spectrum of noise patterns. The proposed model learns noise-resilient feature representations and can thereby predict data instances correctly. More specifically, we formulate robust learning against noisy labels as a domain adaptation problem: noisy data (i.e., samples with incorrect labels) and clean data in the dataset are identified as two domains, and their domain discrepancy in the feature space is minimized. In this framework, a high-order-ensemble adaptation network is devised to provide high-confidence predictions, from which a specific criterion is derived for differentiating clean and noisy data. A new metric based on data augmentation is designed to measure the discrepancy between the clean and noisy domains. Together with a min-max learning strategy between the feature encoder and the classifier on this discrepancy, the domain gap is bridged, which encourages a noise-resilient model. In-depth theoretical analysis and extensive experiments on widely used benchmark datasets demonstrate the effectiveness of the proposed method.
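The following is a minimal PyTorch sketch, not the authors' implementation, of the min-max idea sketched in the abstract: split a batch into "clean" and "noisy" subsets via a confidence-based criterion, measure a feature-space discrepancy between the two subsets, then let the classifier maximize and the encoder minimize that discrepancy. The network sizes, the `conf_threshold` cutoff, and the simple mean-distance `discrepancy` stand in for the paper's high-order-ensemble network and augmentation-based metric, and are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy encoder/classifier pair; the paper's architecture is not specified here.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 256), nn.ReLU())
classifier = nn.Linear(256, 10)
opt_enc = torch.optim.SGD(encoder.parameters(), lr=1e-2)
opt_cls = torch.optim.SGD(classifier.parameters(), lr=1e-2)
conf_threshold = 0.9  # assumed confidence cutoff for labeling a sample "clean"


def discrepancy(f_clean, f_noisy):
    # Placeholder domain-discrepancy measure: squared distance between feature
    # means. The paper instead defines a metric based on data augmentation.
    return (f_clean.mean(dim=0) - f_noisy.mean(dim=0)).pow(2).sum()


def train_step(x, y):
    feats = encoder(x)
    logits = classifier(feats)
    conf, pred = F.softmax(logits, dim=1).max(dim=1)

    # Identification step: treat high-confidence predictions that agree with
    # the given labels as the clean domain, the rest as the noisy domain.
    clean_mask = (conf > conf_threshold) & (pred == y)
    noisy_mask = ~clean_mask
    if clean_mask.sum() == 0 or noisy_mask.sum() == 0:
        return  # both domains must be present in the batch

    ce = F.cross_entropy(logits[clean_mask], y[clean_mask])
    gap = discrepancy(feats[clean_mask], feats[noisy_mask])

    # Max step: the classifier fits the clean labels while maximizing the gap.
    opt_cls.zero_grad()
    (ce - gap).backward(retain_graph=True)
    opt_cls.step()

    # Min step: the encoder minimizes the gap, aligning the two domains.
    opt_enc.zero_grad()
    feats = encoder(x)  # fresh graph for the encoder update
    gap = discrepancy(feats[clean_mask], feats[noisy_mask])
    gap.backward()
    opt_enc.step()
```

The alternating updates mirror a standard min-max (adversarial) alignment scheme; in the paper this game is played on the augmentation-based discrepancy rather than the mean-distance proxy used above.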