Can LLMs Verify Arabic Claims? Evaluating the Arabic Fact-Checking Abilities of Multilingual LLMs

Published: 12 Oct 2024, Last Modified: 14 Nov 2024, SafeGenAi Poster, CC BY 4.0
Keywords: Multilingual Fact-Checking, Large Language Models (LLMs), Arabic Natural Language Processing (NLP), Claim Verification, Cross-Lingual Prompting, Chain-of-Thought Reasoning, Self-Consistency, Zero-Shot Learning, Misinformation Detection, Arabic Claims Verification
TL;DR: This paper evaluates multilingual LLMs for Arabic fact-checking, comparing methods like Zero-Shot, Chain-of-Thought, and Cross-Lingual Prompting. Cross-Lingual Prompting yields the best results, significantly improving model accuracy.
Abstract: Large language models (LLMs) have demonstrated potential in fact-checking claims, yet their capabilities in verifying claims in multilingual contexts remain largely understudied. This paper investigates the efficacy of various prompting techniques, viz. Zero-Shot, English Chain-of-Thought, Self-Consistency, and Cross-Lingual Prompting, in enhancing the fact-checking and claim-verification abilities of LLMs for Arabic claims. We utilize 771 Arabic claims sourced from the X-fact dataset to benchmark the performance of four LLMs. To the best of our knowledge, ours is the first study to benchmark the inherent Arabic fact-checking abilities of LLMs stemming from their knowledge of Arabic facts, using a variety of prompting methods. Our results reveal significant variations in accuracy across different prompting methods. Our findings suggest that Cross-Lingual Prompting outperforms other methods, leading to notable performance gains.
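The Cross-Lingual Prompting approach described above can be illustrated with a minimal sketch: the model is asked to translate the Arabic claim into English, reason in English, and then emit a verdict. The function name, prompt template, and label set below are illustrative assumptions, not the paper's actual prompts.

```python
def build_cross_lingual_prompt(arabic_claim: str) -> str:
    """Construct an illustrative cross-lingual fact-checking prompt.

    The prompt asks the model to translate the Arabic claim into
    English, reason step by step in English, then output a label.
    Template wording and labels are hypothetical.
    """
    return (
        "You are a fact-checking assistant.\n"
        f"Claim (Arabic): {arabic_claim}\n"
        "Step 1: Translate the claim into English.\n"
        "Step 2: Reason step by step in English about its veracity.\n"
        "Step 3: Answer with exactly one label: TRUE, FALSE, or UNVERIFIABLE.\n"
    )

# Example: an Arabic claim ("The Earth is flat") passed through the template.
prompt = build_cross_lingual_prompt("الأرض مسطحة")
print(prompt)
```

The intuition, consistent with the abstract's finding, is that shifting the reasoning step into English lets the model draw on its stronger English-language knowledge while still verifying an Arabic claim.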
Submission Number: 89