NewsEdits 2.0: Learning the Intentions Behind Updating News

ACL ARR 2024 June Submission3168 Authors

15 Jun 2024 (modified: 02 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: As events progress, news articles are often updated with new information: if we are not cautious, we risk propagating outdated facts. In this work, we hypothesize that linguistic features indicate factual fluidity, and that we can _predict which facts in a news article will update_ using solely the text of the article (i.e., without external resources such as search engines). We test this hypothesis, first, by isolating fact updates in large news revision corpora. News articles may be updated for many reasons (e.g., factual, stylistic, narrative). We introduce the _NewsEdits 2.0_ taxonomy, an edit-intentions schema that separates fact updates from stylistic and narrative updates in news writing. We annotate over 9,200 pairs of sentence revisions and train high-scoring ensemble models to apply this schema. Then, taking a large dataset of silver-labeled pairs, we show that we can predict with high precision when facts in older article drafts will update. Finally, to demonstrate the usefulness of these findings, we construct a language model question-answering (LLM-QA) abstention task. Inspired by Kasai et al. (2022), we want the LLM to abstain from answering questions when the information is likely to become outdated. Using our predictions, we show that LLM abstention reaches _near-oracle levels of accuracy_.
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: quantitative analyses of news and/or social media
Contribution Types: Model analysis & interpretability, Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: English
Submission Number: 3168