Abstract: In fact-checking, claims, evidence and final verdicts have been standardized by the ClaimReview schema, but annotation efforts that provide automated fact-checking data usually do not follow this detailed and labor-intensive documentation process. Through automated extraction of structured information, we leverage the work of professional fact-checkers to contribute a new dataset of 17,000 fact-checking articles. We then propose an automated extraction method based on few-shot inference that jointly extracts decontextualized evidence, justifications and final verdicts. Our human evaluation shows very high quality of the extracted content. We also benchmark state-of-the-art large language models (LLMs) on justification generation and claim verification, finding that decontextualization yields slightly better performance than extracted evidence and that LLMs fall back on parametric knowledge when evidence is not explicitly provided.
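To make the few-shot joint extraction concrete, the sketch below shows one plausible way to prompt an LLM to return evidence, justification and verdict in a single structured output. This is an illustrative assumption, not the authors' code: the exemplar, the field names (evidence, justification, verdict) and the build_prompt helper are all hypothetical, and the prompt is only assembled and printed rather than sent to a model.

```python
import json

# Hypothetical few-shot exemplar; article text and field names are
# illustrative only, not drawn from the paper's dataset.
FEW_SHOT_EXAMPLES = [
    {
        "article": "The senator claimed unemployment doubled in 2023. "
                   "Bureau figures show it rose from 3.5% to 3.9%.",
        "output": {
            "evidence": "Bureau figures show unemployment rose from 3.5% to 3.9% in 2023.",
            "justification": "A rise from 3.5% to 3.9% is far from a doubling.",
            "verdict": "False",
        },
    },
]

def build_prompt(article: str) -> str:
    """Assemble a few-shot prompt asking for decontextualized evidence,
    a justification, and the final verdict as one JSON object."""
    parts = [
        "Extract the decontextualized evidence, the justification, and the "
        "final verdict from the fact-checking article. Answer in JSON."
    ]
    for ex in FEW_SHOT_EXAMPLES:
        parts.append(f"Article: {ex['article']}\nAnswer: {json.dumps(ex['output'])}")
    parts.append(f"Article: {article}\nAnswer:")
    return "\n\n".join(parts)

if __name__ == "__main__":
    # In practice the prompt would be sent to an LLM; here we just print it.
    print(build_prompt("Example fact-checking article text goes here."))
```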
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Automated Fact-checking, LLM-based information extraction, dataset construction
Contribution Types: NLP engineering experiment, Data resources, Data analysis
Languages Studied: English
Submission Number: 8279