Abstract: Fake news propagation on social media is on the rise. To address it, it is crucial to understand which social media posts (e.g., tweets) cause fake news to disseminate widely, and which lexicons within a post play essential roles in that propagation. However, modeling only the correlation between social media posts and dissemination captures spurious relationships, yielding imprecise dissemination predictions and incorrect identification of important lexicons, because it does not eliminate the effect of confounders. Additionally, existing causal inference models cannot handle numerical and textual covariates simultaneously. We therefore propose a novel causal inference model that combines textual and numerical covariates through soft-prompt learning and removes irrelevant information from the covariates via conditional treatment generation, in order to learn an effective confounder representation. The model then identifies critical lexicons through a post-hoc explanation method. Our model outperforms baseline methods on two fake news benchmark datasets in both dissemination prediction and identification of the important lexicons related to dissemination. The code is available at https://github.com/bigheiniu/CausalFakeNews.