Abstract: Fake news data are often sampled from the same communities, so the veracity of news is highly correlated with certain textual and visual entities. This correlation makes fake news classification models prone to shortcut learning: they overfit quickly by capturing only shallow, spurious correlations between features and labels. Consequently, neural networks trained on such data generalize poorly and may misclassify news under distribution shifts. In this paper, we propose a **DI**sentanglement-based **C**ausality-awar**E** fake news detection method (DICE). DICE models multimodal news with a graph neural network and captures causal relationships between multimodal features and veracity labels using node and edge mask disentanglers. To reinforce this disentanglement process, we design a loss function that minimizes extrapolation risk; it supervises training and yields disentangled causal and biased representations of news. Extensive experiments demonstrate that DICE achieves superior performance on five large-scale fake news detection benchmarks. Additionally, our evaluation on a heavily biased fake news dataset demonstrates DICE's strong generalization, suggesting its potential to inform a new paradigm in causal fake news detection.
Paper Type: Long
Research Area: Information Retrieval and Text Mining
Research Area Keywords: Fake News Detection
Contribution Types: NLP engineering experiment, Data analysis
Languages Studied: English, Chinese
Submission Number: 1937