Adaptive Textual Label Noise Learning based on Pre-trained Models

Published: 07 Oct 2023, Last Modified: 01 Dec 2023
Venue: EMNLP 2023 Findings
Submission Type: Regular Long Paper
Submission Track: Efficient Methods for NLP
Submission Track 2: Sentiment Analysis, Stylistic Analysis, and Argument Mining
Keywords: learning with noisy labels, label noise learning, pre-trained models, text classification
TL;DR: We develop an adaptive textual label noise learning framework based on pre-trained models that handles a variety of noise scenarios well, including noise ratios of up to 40% and mixed noise types.
Abstract: Label noise in real-world scenarios is unpredictable and can even be a mixture of different noise types. To meet this challenge, we develop an adaptive textual label noise learning framework based on pre-trained models, which consists of an adaptive warm-up stage and a hybrid training stage. Specifically, an early stopping method, relying solely on the training set, is designed to dynamically terminate the warm-up process according to how well the model fits each noise scenario. The hybrid training stage then incorporates several generalization strategies to gradually correct mislabeled instances, thereby making better use of noisy data. Experiments on multiple datasets demonstrate that our approach performs comparably to, or even surpasses, state-of-the-art methods in various noise scenarios, including those with mixtures of multiple noise types.
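To make the adaptive warm-up idea concrete, below is a minimal sketch of a training-set-only early-stopping check: warm-up ends once the model's fit level on the (noisy) training data stops improving. The plateau criterion and all names here (`fit_history`, `patience`, `min_delta`) are illustrative assumptions, not the authors' exact rule.

```python
# Hypothetical sketch: terminate the warm-up stage when the model's fit level
# on the noisy training set plateaus.  The criterion below is an assumption
# for illustration, not the paper's actual stopping rule.

def should_stop_warmup(fit_history, patience=3, min_delta=0.005):
    """Return True if the last `patience` epochs each improved the
    training-set fit level by less than `min_delta` over the best
    value seen before them."""
    if len(fit_history) <= patience:
        return False
    best_before = max(fit_history[:-patience])
    recent = fit_history[-patience:]
    return all(v - best_before < min_delta for v in recent)


if __name__ == "__main__":
    # Example: training-set fit rises quickly, then plateaus as the model
    # begins to memorize noisy labels; warm-up is terminated at the plateau.
    fit_per_epoch = []
    for epoch, fit in enumerate([0.55, 0.68, 0.74, 0.76, 0.762, 0.763, 0.763]):
        fit_per_epoch.append(fit)
        if should_stop_warmup(fit_per_epoch):
            print(f"stop warm-up after epoch {epoch}")
            break
```

The key design point the sketch mirrors is that the stopping decision uses only training-set statistics, so no clean validation labels are needed; the hybrid training stage that follows would then use generalization strategies to relabel or down-weight suspect instances.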
Submission Number: 1440