Abstract: Extractive Question Answering (EQA) involves identifying the correct answer span in a passage in response to a given question. The emergence of Pretrained Language Models (PLMs) has sparked increased interest in applying them to EQA, yielding promising results. Nonetheless, current approaches frequently neglect label noise, which arises from incomplete labeling and inconsistent annotations and degrades model performance. To address this issue, we propose the Contrastive Puzzles and Reweighted Clues (CPRC) method, designed to mitigate the adverse effects of label noise. During training, CPRC partitions the training data into Puzzle and Clue samples based on their loss and their text similarity to the golden answer. It then applies a hybrid intra- and inter-contrastive learning objective to the Puzzle samples and dynamically reweights the Clue samples. Experimental results on three benchmark datasets demonstrate that CPRC outperforms conventional approaches, confirming its efficacy in mitigating label noise and improving EQA performance.
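To illustrate the partitioning step described above, the following is a minimal sketch, assuming a loss threshold and a token-level F1 similarity to the golden answer as the two criteria; the function names, thresholds, and sample fields are hypothetical and not taken from the paper.

```python
def token_f1(prediction: str, golden: str) -> float:
    """Token-level F1 between a predicted span and the golden answer."""
    pred_tokens = prediction.split()
    gold_tokens = golden.split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Count overlapping tokens (multiset intersection).
    gold_counts: dict[str, int] = {}
    for t in gold_tokens:
        gold_counts[t] = gold_counts.get(t, 0) + 1
    common = 0
    for t in pred_tokens:
        if gold_counts.get(t, 0) > 0:
            common += 1
            gold_counts[t] -= 1
    if common == 0:
        return 0.0
    precision = common / len(pred_tokens)
    recall = common / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

def partition_samples(samples, loss_threshold=1.0, sim_threshold=0.5):
    """Split samples into (clues, puzzles): low-loss, high-similarity
    samples are treated as reliable Clues; the rest as noisy Puzzles."""
    clues, puzzles = [], []
    for s in samples:
        similar = token_f1(s["prediction"], s["golden"]) >= sim_threshold
        if s["loss"] < loss_threshold and similar:
            clues.append(s)
        else:
            puzzles.append(s)
    return clues, puzzles
```

In an actual training loop, the Clue set would then feed the reweighted loss term while the Puzzle set would feed the contrastive objectives; the thresholds here stand in for whatever criterion the method actually uses.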