Multimodal Stacked Cross Attention Network for Fine-Grained Fake News Detection

Published: 01 Jan 2023, Last Modified: 19 Feb 2025. Venue: ICME 2023. License: CC BY-SA 4.0
Abstract: Fake news is usually disseminated in multimodal form, combining natural language, visual content, and other modalities. Many deep learning approaches have therefore been proposed to detect multimodal fake news. However, existing methods simply fuse unimodal features and ignore the latent semantic alignment between the image and text modalities. In this paper, we propose a novel Multimodal Stacked Cross Attention Network (MSCA) to better align and fuse token-level textual and visual features for fake news detection. Experiments on two publicly available datasets show that our method significantly improves performance over other models. Furthermore, experimental analysis shows that MSCA effectively aligns and fuses token-level features across modalities.
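To make the idea of token-level alignment concrete, the following is a minimal sketch of generic stacked cross attention between text tokens and image regions. It is not the MSCA architecture, whose details the abstract does not give. The function names, the residual update, the mean pooling, and the number of layers are all assumptions chosen for illustration, and the sketch has no learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, contexts):
    """Scaled dot-product attention from each query vector over all context vectors.

    Returns (attended, weights): attended[i] is the weighted sum of contexts
    for query i, and weights[i][j] is how strongly query i attends to context j.
    """
    dim = len(queries[0])
    attended, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, c)) / math.sqrt(dim) for c in contexts]
        w = softmax(scores)
        attended.append(
            [sum(w[j] * contexts[j][k] for j in range(len(contexts))) for k in range(dim)]
        )
        weights.append(w)
    return attended, weights


def stacked_cross_attention(text_tokens, image_regions, num_layers=2):
    """Hypothetical stack: each layer lets text attend to image and image to text.

    Assumption: a plain residual update per layer, then mean pooling and
    concatenation to obtain one fused vector for a downstream classifier.
    """
    text, image = text_tokens, image_regions
    for _ in range(num_layers):
        text_ctx, _ = cross_attention(text, image)   # text queries, image context
        image_ctx, _ = cross_attention(image, text)  # image queries, text context
        text = [[t + c for t, c in zip(tv, cv)] for tv, cv in zip(text, text_ctx)]
        image = [[i + c for i, c in zip(iv, cv)] for iv, cv in zip(image, image_ctx)]
    pooled_text = [sum(col) / len(text) for col in zip(*text)]
    pooled_image = [sum(col) / len(image) for col in zip(*image)]
    return pooled_text + pooled_image


# Toy input: three 2-d "word" vectors and two 2-d "image region" vectors.
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image = [[1.0, 0.0], [0.0, 1.0]]
fused = stacked_cross_attention(text, image)
```

In this toy input, the attention weights returned by `cross_attention(text, image)` play the role of a soft alignment: the first text token attends more strongly to the first image region because their vectors point in the same direction. A real model would use learned projections and encoder outputs rather than raw toy vectors.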