Multi-Task Network Guided Multimodal Fusion for Fake News Detection

Jinke Ma, Liyuan Zhang, Yong Liu, Wei Zhang

Published: 2024, Last Modified: 11 Apr 2025, ACML 2024, CC BY-SA 4.0

Abstract: Fake news detection has become a hot research topic in the multimodal domain. Existing multimodal fake news detection research uses a series of feature fusion networks to gather useful information from the different modalities of news posts. However, how to form effective cross-modal features and how cross-modal correlations affect decision-making remain open questions. This paper introduces MMFND, a multi-task guided multimodal fusion framework for fake news detection, which introduces multi-task modules for feature refinement and fusion. Pairwise CLIP encoders are used to extract modality-aligned deep representations, enabling accurate measurement of cross-modal correlations. Feature fusion is then enhanced by weighting the multimodal features with normalised cross-modal correlations. Extensive experiments on typical fake news datasets demonstrate that MMFND outperforms state-of-the-art approaches.
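
The correlation-weighted fusion described in the abstract could be sketched roughly as follows. This is an illustration only, not the authors' implementation: the abstract does not specify the correlation measure, the normalisation, or the fusion operator, so the cosine similarity, the rescaling to [0, 1], the element-wise product, and the names `cosine_similarity` and `fuse` are all assumptions. The input vectors stand in for text and image embeddings that a pair of CLIP encoders would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse(text_emb, image_emb):
    """Hypothetical fusion: weight a cross-modal feature by a normalised correlation.

    Returns (fused_vector, weight).
    """
    # Assumption: cross-modal correlation is the cosine similarity of the
    # modality-aligned embeddings.
    sim = cosine_similarity(text_emb, image_emb)
    # Assumption: normalise the correlation from [-1, 1] to [0, 1].
    weight = (sim + 1.0) / 2.0
    # Assumption: the cross-modal feature is the element-wise product.
    cross = [a * b for a, b in zip(text_emb, image_emb)]
    # Concatenate both unimodal features with the weighted cross-modal feature.
    fused = list(text_emb) + list(image_emb) + [weight * c for c in cross]
    return fused, weight
```

Under these assumptions, a post whose text and image embeddings point in the same direction gets a weight of 1, so its cross-modal feature passes through unchanged, while a post with opposed embeddings gets a weight of 0 and the classifier sees only the unimodal features.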