MCFEND: A Multi-source Benchmark Dataset for Chinese Fake News Detection

Published: 23 Jan 2024, Last Modified: 23 May 2024, TheWebConf24 Oral
Keywords: Multi-source Benchmark Dataset, Chinese Fake News Detection, Cross-source Evaluation, Multi-source Evaluation
TL;DR: A Multi-source Benchmark Dataset for Chinese Fake News Detection
Abstract: The prevalence of fake news across various online sources can have a significant influence on the public. Existing Chinese fake news detection datasets are limited to news sourced solely from Weibo. However, fake news that originates from multiple sources is diverse in many respects, including its content and social context. Methods trained on data from a single news source are therefore hardly applicable to real-world scenarios. Our pilot experiment demonstrates that the macro F1 score of the state-of-the-art method trained on Weibo-21, the largest Chinese fake news detection dataset to date, drops from 0.98 to 0.47 when the test data is changed from Weibo-21 to multi-source data, and the method fails to identify 35.34% of the multi-source fake news. To address this limitation, we construct the first multi-source benchmark dataset for Chinese fake news detection, termed MCFEND, which contains news collected from diverse sources, such as social platforms, messaging apps, and traditional online news outlets, and fact-checked by 14 authoritative fact-checking agencies. In addition, various established Chinese fake news detection methods, including the state-of-the-art approaches, are thoroughly evaluated on our proposed dataset in both the cross-source and multi-source scenarios. MCFEND contributes to the field of fake news detection by providing a benchmark to evaluate and advance Chinese fake news detection approaches in real-world scenarios.
Track: Web Mining and Content Analysis
Student Author: Yes
Submission Number: 422
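
Note on the metrics: the abstract reports a macro F1 score and the share of fake news the detector fails to identify. The following is a minimal sketch of how such figures are commonly computed, assuming binary labels named "fake" and "real". The function names, label names, and toy data are illustrative assumptions and are not taken from the paper or the MCFEND release.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> float:
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str] = ("fake", "real")) -> float:
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


def fake_miss_rate(y_true: Sequence[str], y_pred: Sequence[str],
                   fake_label: str = "fake") -> float:
    """Fraction of truly fake items that were not predicted as fake."""
    preds_on_fake = [p for t, p in zip(y_true, y_pred) if t == fake_label]
    if not preds_on_fake:
        return 0.0
    return sum(1 for p in preds_on_fake if p != fake_label) / len(preds_on_fake)


# Toy data (hypothetical): ground truth and predictions on a test set
# drawn from sources other than the training source.
y_true = ["fake", "fake", "fake", "real", "real", "real"]
y_pred = ["fake", "real", "real", "real", "real", "fake"]

score = macro_f1(y_true, y_pred)
missed = fake_miss_rate(y_true, y_pred)
print(f"macro F1 = {score:.4f}, missed fake news = {missed:.2%}")
```

In a cross-source setting, the same functions would be applied twice: once to predictions on held-out data from the training source, and once to predictions on data from other sources, so that the two macro F1 values can be compared directly.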