Rethinking News Text Classification from a Timeliness Perspective under the Pre-training and Fine-tuning Paradigm

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission
Abstract: Pre-trained language models (PLMs) have driven significant progress in NLP. News text classification is one of the most fundamental NLP tasks, and prior work has shown that fine-tuning PLMs can achieve up to 98% accuracy on it, suggesting the task is well addressed. However, we discover that news timeliness has a massive impact on news text classification, dropping accuracy by nearly 20 percentage points from the initial results. In this paper, we define the timeliness issue in news classification and design experiments to measure its influence. Moreover, we investigate several methods for recognizing and replacing obsolete vocabulary. The results show, however, that it is difficult to eliminate the impact of news timeliness from the word-level perspective alone. In addition, we release a set of large-scale, time-sensitive news datasets to facilitate the study of this problem.