Abstract: Detecting rumors on social media has become a critical research challenge.
Although existing multimodal rumor detection methods have achieved promising results, they still suffer from insufficient utilization of modality-specific information and inadequate cross-modal interaction.
To address these limitations, we propose a novel Cross-Modal Consistency Enhancement (CME) model for multimodal rumor detection.
It incorporates textual, visual, and propagation modalities into a unified framework and transforms each modality into a graph.
The uncertainties of the three modalities are utilized to guide modality reconstruction.
We design a modality alignment module, comprising feature alignment and structure alignment, to improve the consistency of cross-modal representations.
During feature alignment, the aligned modality representations serve as the teacher in a graph-guided self-distillation module that supervises each unimodal student representation.
Structure alignment is introduced to model structural similarities across modalities.
Extensive experiments on two public real-world datasets demonstrate that our CME model significantly outperforms state-of-the-art baselines.
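The teacher-student supervision described for feature alignment can be illustrated with a minimal sketch. The abstract does not specify the distillation objective, so everything below is an assumption made for illustration: the temperature-scaled KL divergence (a common choice in knowledge distillation), the function names, and the example logits are hypothetical and are not taken from the CME model.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def self_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Average KL divergence from the teacher to each unimodal student.

    student_logits: dict mapping a modality name to its logits (hypothetical layout).
    teacher_logits: logits derived from the aligned representation (assumed given).
    """
    teacher = softmax(teacher_logits, temperature)
    losses = [
        kl_divergence(teacher, softmax(logits, temperature))
        for logits in student_logits.values()
    ]
    return sum(losses) / len(losses)


# Illustrative values only: two classes (rumor / non-rumor), three modalities.
teacher_logits = [2.0, -1.0]
students = {
    "text": [1.5, -0.5],
    "image": [0.2, 0.1],
    "propagation": [2.0, -1.0],
}
loss = self_distillation_loss(students, teacher_logits)
```

In this sketch a student whose logits match the teacher contributes zero loss, while a student that disagrees with the teacher contributes a positive term, which is the sense in which the aligned representation supervises each modality.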
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: rumor detection
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English, Chinese
Submission Number: 7547