Do We Need Language-Specific Fact-Checking Models? The Case of Chinese

ACL ARR 2024 April Submission 407 Authors

15 Apr 2024 (modified: 25 May 2024) · CC BY 4.0
Abstract: This paper investigates the potential benefits of language-specific fact-checking models, focusing on the case of Chinese using the CHEF dataset. To better reflect real-world fact-checking, we first develop a novel Chinese document-level evidence retriever that achieves state-of-the-art performance. We then demonstrate the limitations of translation-based methods and multilingual language models, highlighting the need for language-specific systems. To better analyze token-level biases in different systems, we construct an adversarial dataset based on CHEF, in which each instance has high word overlap with the original but carries the opposite veracity label. Experimental results on the CHEF dataset and our adversarial dataset show that our proposed method outperforms translation-based methods and multilingual language models and is more robust to such biases, underscoring the importance of language-specific fact-checking systems.
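
The abstract describes two components only at a high level. As a concrete illustration of what a document-level evidence retriever does, the sketch below ranks whole documents (rather than individual sentences) against a claim; BM25 via rank_bm25 and jieba segmentation are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch: document-level evidence retrieval for a Chinese claim.
# BM25 (rank_bm25) and jieba are illustrative stand-ins; the paper's
# retriever is not described in this abstract.
import jieba
from rank_bm25 import BM25Okapi

# Toy evidence corpus: each entry is a whole document, not a sentence.
documents = [
    "世卫组织报告指出该疫苗在临床试验中对老年人有效。",  # WHO report: vaccine effective for the elderly
    "这篇评论讨论了社交媒体上的谣言传播方式。",          # op-ed on rumor spread on social media
]
bm25 = BM25Okapi([jieba.lcut(doc) for doc in documents])

claim = "疫苗对老年人有效"  # "The vaccine is effective for the elderly"
# Retrieve the highest-scoring whole document as evidence for verification.
evidence_docs = bm25.get_top_n(jieba.lcut(claim), documents, n=1)
print(evidence_docs[0])
```

Similarly, the adversarial dataset is defined by a word-overlap-with-flipped-label property. The sketch below checks that property for a candidate pair, assuming jieba segmentation, Jaccard overlap, and an arbitrary 0.5 threshold; the paper's actual construction procedure and overlap metric may differ.

```python
# Hypothetical check of the adversarial-pair property from the abstract:
# high word overlap with the original claim, but the opposite veracity label.
import jieba


def word_overlap(claim_a: str, claim_b: str) -> float:
    """Jaccard overlap between the segmented word sets of two claims."""
    words_a, words_b = set(jieba.lcut(claim_a)), set(jieba.lcut(claim_b))
    if not (words_a or words_b):
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def is_adversarial_pair(claim: str, perturbed: str, label: str,
                        perturbed_label: str, min_overlap: float = 0.5) -> bool:
    """Accept the perturbed claim only if it keeps high word overlap
    with the original while carrying the opposite label."""
    return label != perturbed_label and word_overlap(claim, perturbed) >= min_overlap


# A simple negation keeps most words but flips SUPPORTED to REFUTED.
print(is_adversarial_pair("疫苗对老年人有效", "疫苗对老年人无效",
                          "SUPPORTED", "REFUTED"))
```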
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: fact checking
Languages Studied: English, Chinese
Submission Number: 407