NewsEdits 2.0: Learning the Intentions Behind Updating News

ACL ARR 2024 April Submission 464 Authors

16 Apr 2024 (modified: 07 Jun 2024), ACL ARR 2024 April Submission, CC BY 4.0
Abstract: As events progress, news articles often update with new information: if we are not cautious, we risk propagating outdated facts in many applications (e.g., large language model question answering (LLM Q&A)). In this work, we address this by _predicting which facts in a news article will update_. In the first part of this work, we isolate fact updates in news revisions. This is challenging: although large news revision corpora have been published, news articles may update for many reasons (e.g., factual, stylistic, narrative). We introduce the _NewsEdits 2.0_ taxonomy, an edit-intentions schema that separates fact updates from stylistic and narrative updates in news writing, annotate over 9,200 pairs of sentence revisions, and train high-scoring ensemble models to apply this schema. Then, taking a large dataset of silver-labeled pairs, we show that we can predict when facts will update in older article drafts. _Linguistic cues exist in news writing that signal factual fluidity_, and these can be learned with a big-data approach. With this insight, we demonstrate the value of these predictions by inducing LLMs to abstain from answering questions when information is likely to be outdated. Using our models, LLM abstention reaches _nearly oracle levels of accuracy_.
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: quantitative analyses of news and/or social media
Contribution Types: Model analysis & interpretability, Publicly available software and/or pre-trained models, Data resources, Data analysis, Theory
Languages Studied: English
Submission Number: 464