Abstract: The rapid proliferation of online information has made it increasingly challenging to differentiate factual content from misinformation. Traditional fact-checking methods, which require extensive manual effort, do not scale to the volume of misinformation spreading online. Automated fact-checking has emerged as a promising solution, leveraging machine learning models trained on datasets derived from fact-checking websites (Wang, 2017; Augenstein et al., 2019; Gupta and Srikumar, 2021). However, many of these datasets include post-analysis commentary from annotators, which may introduce bias and provide implicit cues that aid model performance. To address this limitation, we introduce Politi-Fact-Only, a benchmark dataset of 1,482 instances curated from PolitiFact.com, in which we remove post-analysis commentary and retain only factual evidence (Fig. 1). This ensures that models must rely solely on factual reasoning rather than verdict-related information. Our experiments demonstrate that state-of-the-art fact-checking models, including large language models (LLMs), struggle to accurately classify claims when deprived of post-claim analysis, highlighting their reliance on implicit cues rather than pure factual reasoning.
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: fact checking, rumor/misinformation detection
Languages Studied: English
Submission Number: 7826