Abstract: Veracity detection has emerged as a crucial NLP task over the last decade, as misinformation spreads rapidly in the digital age. Most datasets available in the community, such as LIAR (Wang, 2017) and PolitiFact (Alhindi
et al., 2018), either lack complete evidence or have a limited number of instances. We present a new dataset, $\textit{Politifact-PLUS}$, which contains $\textit{claim}$, $\textit{evidence}$, $\textit{speaker}$, and $\textit{label}$ fields and is designed for the task of 5-class veracity detection, with a particular focus on the political domain. Our analysis examines the efficacy of large language models (LLMs) using prompting approaches, alongside a multi-agent task decomposition framework, for veracity detection on our dataset. Notably, we found that the few-shot prompting technique achieved the highest F1 score of $\textbf{0.7603}$, while the task decomposition approach yielded an F1 score of $\textbf{0.6611}$. Our findings highlight significant confusion among the Mostly True, Half True, and Mostly False classes. We hope this work inspires the community to develop more robust techniques for veracity detection.
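As an illustration only (not taken from the submission), a minimal sketch of the kind of few-shot prompting setup the abstract describes might look like the following. The field names mirror the dataset description (claim, evidence, speaker, label); the exact 5-class label set and the `call_llm` helper are assumptions for this sketch, not the authors' implementation.

```python
# Hypothetical sketch: few-shot prompt construction for 5-class veracity detection
# over (claim, evidence, speaker) records. The label set below is assumed; the
# actual LLM call is left abstract as `call_llm`, a hypothetical stand-in.

LABELS = ["True", "Mostly True", "Half True", "Mostly False", "False"]  # assumed 5-class scheme


def build_prompt(examples, target):
    """Assemble a few-shot prompt from labeled examples plus one unlabeled target claim."""
    lines = ["Classify each claim into one of: " + ", ".join(LABELS) + ".", ""]
    for ex in examples:
        lines += [
            f"Speaker: {ex['speaker']}",
            f"Claim: {ex['claim']}",
            f"Evidence: {ex['evidence']}",
            f"Label: {ex['label']}",
            "",
        ]
    lines += [
        f"Speaker: {target['speaker']}",
        f"Claim: {target['claim']}",
        f"Evidence: {target['evidence']}",
        "Label:",
    ]
    return "\n".join(lines)


def parse_label(completion: str) -> str:
    """Map a raw model completion back onto the 5-class label set."""
    text = completion.strip().lower()
    # Check longer labels first so "Mostly True" is not matched as "True".
    for label in sorted(LABELS, key=len, reverse=True):
        if label.lower() in text:
            return label
    return "Half True"  # fallback when the model output cannot be parsed
```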
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: fact checking, rumor/misinformation detection
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: English
Submission Number: 1677