Do We Need Language-Specific Fact-Checking Models? The Case of Chinese

ACL ARR 2024 April Submission 407 Authors

15 Apr 2024 (modified: 25 May 2024) · CC BY 4.0
Abstract: This paper investigates the potential benefits of language-specific fact-checking models, focusing on the case of Chinese using the CHEF dataset. To better reflect real-world fact-checking, we first develop a novel Chinese document-level evidence retriever that achieves state-of-the-art performance. We then demonstrate the limitations of translation-based methods and multilingual language models, highlighting the need for language-specific systems. To better analyze token-level biases in different systems, we construct an adversarial dataset based on CHEF, in which each instance has high word overlap with the original but carries the opposite veracity label. Experimental results on the CHEF dataset and our adversarial dataset show that our proposed method outperforms translation-based methods and multilingual language models and is more robust to such biases, underscoring the importance of language-specific fact-checking systems.
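
The abstract describes two components only at a high level. As a concrete illustration of what a document-level evidence retriever does, the sketch below ranks whole documents (rather than individual sentences) against a claim; BM25 via rank_bm25 and jieba segmentation are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch: document-level evidence retrieval for a Chinese claim.
# BM25 (rank_bm25) and jieba are illustrative stand-ins; the paper's
# retriever is not described in this abstract.
import jieba
from rank_bm25 import BM25Okapi

# Toy evidence corpus: each entry is a whole document, not a sentence.
documents = [
    "世卫组织报告指出该疫苗在临床试验中对老年人有效。",  # WHO report: vaccine effective for the elderly
    "这篇评论讨论了社交媒体上的谣言传播方式。",          # op-ed on rumor spread on social media
]
bm25 = BM25Okapi([jieba.lcut(doc) for doc in documents])

claim = "疫苗对老年人有效"  # "The vaccine is effective for the elderly"
# Retrieve the highest-scoring whole document as evidence for verification.
evidence_docs = bm25.get_top_n(jieba.lcut(claim), documents, n=1)
print(evidence_docs[0])
```

Similarly, the adversarial dataset is defined by a word-overlap-with-flipped-label property. The sketch below checks that property for a candidate pair, assuming jieba segmentation, Jaccard overlap, and an arbitrary 0.5 threshold; the paper's actual construction procedure and overlap metric may differ.

```python
# Hypothetical check of the adversarial-pair property from the abstract:
# high word overlap with the original claim, but the opposite veracity label.
import jieba


def word_overlap(claim_a: str, claim_b: str) -> float:
    """Jaccard overlap between the segmented word sets of two claims."""
    words_a, words_b = set(jieba.lcut(claim_a)), set(jieba.lcut(claim_b))
    if not (words_a or words_b):
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def is_adversarial_pair(claim: str, perturbed: str, label: str,
                        perturbed_label: str, min_overlap: float = 0.5) -> bool:
    """Accept the perturbed claim only if it keeps high word overlap
    with the original while carrying the opposite label."""
    return label != perturbed_label and word_overlap(claim, perturbed) >= min_overlap


# A simple negation keeps most words but flips SUPPORTED to REFUTED.
print(is_adversarial_pair("疫苗对老年人有效", "疫苗对老年人无效",
                          "SUPPORTED", "REFUTED"))
```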
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: fact checking
Languages Studied: English, Chinese
Submission Number: 407