Mitigating World Biases: A Multimodal Multi-View Debiasing Framework for Fake News Video Detection

Published: 20 Jul 2024 · Last Modified: 21 Jul 2024 · MM2024 Poster · CC BY 4.0
Abstract: Short videos have become an important channel for public information sharing, and they have also become a fertile ground for fake news. Fake news video detection aims to judge the veracity of a news item from its multiple modalities, such as video, audio, text, images, and social context. Current detection models tend to learn multimodal dataset biases, i.e., spurious correlations between news modalities and veracity labels, as shortcuts rather than learning to integrate and reason over the underlying multimodal information, which seriously degrades their detection and generalization capabilities. To address this issue, we propose a Multimodal Multi-View Debiasing (MMVD) framework, which makes the first attempt to mitigate various multimodal biases in fake news video detection. Inspired by the ways in which multimodal short videos mislead people, we summarize three cognitive biases: static, dynamic, and social biases. MMVD puts forward a multi-view causal reasoning strategy to learn unbiased dependencies underlying these cognitive biases, thereby enhancing unbiased prediction for multimodal videos. Extensive experimental results show that MMVD improves the detection performance on multimodal fake news videos. Further studies confirm that MMVD can mitigate multiple biases in complex real-world scenarios and improve the generalization ability of multimodal models.
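To give a concrete sense of the causal-debiasing idea described above, the sketch below shows a generic multi-branch pattern: bias-only branches are trained alongside the full multimodal branch, and their direct effect is subtracted at inference (counterfactual reasoning). This is an illustrative assumption only, not the MMVD implementation; the branch names (static_head, social_head), the subtraction form, and the weight alpha are hypothetical.

import torch.nn as nn

class DebiasedClassifier(nn.Module):
    # Hypothetical sketch of counterfactual debiasing; not the authors' MMVD code.
    def __init__(self, dim=256, num_classes=2):
        super().__init__()
        self.fusion_head = nn.Linear(dim, num_classes)  # full multimodal branch
        self.static_head = nn.Linear(dim, num_classes)  # bias-only branch (static cues)
        self.social_head = nn.Linear(dim, num_classes)  # bias-only branch (social context)

    def forward(self, fused, static_feat, social_feat, alpha=1.0, debias=True):
        logits = self.fusion_head(fused)
        bias_logits = self.static_head(static_feat) + self.social_head(social_feat)
        if debias:
            # Inference: subtract the direct effect of the bias branches so the
            # prediction relies on the integrated multimodal evidence.
            return logits - alpha * bias_logits
        # Training: supervise the fused branch and the bias branches jointly.
        return logits, bias_logits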
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Content] Multimodal Fusion
Relevance To Conference: Our work addresses multimodal fake news video detection on short video platforms. Short videos have become an important channel for public information sharing, and they have also become a fertile ground for fake news. Fake news video detection aims to judge the veracity of a news item from its multiple modalities, such as video, audio, text, images, and social context. Current detection models tend to learn multimodal dataset biases, i.e., spurious correlations between news modalities and veracity labels, as shortcuts rather than learning to integrate and reason over the underlying multimodal information, which seriously degrades their detection and generalization capabilities. Our work therefore focuses not only on detecting fake news using multimodal information, but also on mitigating bias during multimodal fusion.
Supplementary Material: zip
Submission Number: 5507