Multilingual Fact-Checking using LLMs

ACL ARR 2024 June Submission5263 Authors

16 Jun 2024 (modified: 02 Aug 2024)ACL ARR 2024 June SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Abstract: Due to the recent rise in digital misinformation, there has been great interest shown in using LLMs for fact-checking and claim verification. In this paper, we answer the question: Do LLMs know multilingual facts and can they use this knowledge for effective fact-checking? To this end, we create a benchmark by filtering multilingual claims from the X-fact dataset and evaluating the multilingual fact-checking capabilities of five LLMs across five diverse languages: Spanish, Italian, Portuguese, Turkish, and Tamil on our benchmark. We employ three different prompting techniques: Zero-Shot, English Chain-of-Thought, and Cross-Lingual Prompting, using both greedy and self-consistency decoding. We extensively analyze our results and find that GPT-4o achieves the highest accuracy, but zero-shot prompting with self-consistency was the most effective overall. We also show that techniques like Chain-of-Thought and Cross-Lingual Prompting, which are designed to improve reasoning abilities, do not necessarily improve the fact-checking abilities of LLMs. Interestingly, we find a strong negative correlation between model accuracy and the amount of internet content for a given language. This suggests that LLMs are better at fact-checking from knowledge in low-resource languages. We hope that this study will encourage more work on multilingual fact-checking using LLMs.
Paper Type: Long
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: Multilingual NLP, Cross-Lingual NLP, Language Models, LLMs, Fact-Checking, Claim Verification, Multilingual Fact-Checking, Prompting Techniques, High-Resource Languages, Low-Resource Languages, Multilingual Evaluation
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches to low-resource settings, Data analysis
Languages Studied: Spanish, Portuguese, Italian, Turkish, Tamil
Submission Number: 5263
Loading