Abstract: Previous fact-checking methods typically first employ cross-modal retrieval to obtain relevant textual and visual evidence, and then feed the retrieved evidence into a model for verification. However, these methods assume that the retrieved evidence is always valid and ignore the possibility of spurious correlations within it. In visual evidence, for example, objects unrelated to the textual claim can be erroneously linked to the answer through simple matching against image content. To address this issue, we propose a new approach that uses causal tools to detect the fine-grained parts of the evidence that are causally related to the claim being verified, thereby removing potentially misleading parts whose spurious correlations could bias the model's judgment. Furthermore, we observe that models frequently rely on shortcut reasoning, judging the veracity of information from a single modality alone. To overcome this limitation, we propose a counterfactual reasoning module built on explicit modeling of causal relationships: the total causal effect comprises the direct causal effect of shortcut reasoning and the genuine causal effect of multimodal reasoning. Using causal graphs and counterfactual reasoning, we separate the shortcut effect from the total causal effect, enabling the model to learn genuine multimodal fact-checking rather than arriving at answers through shortcuts. Extensive experiments on the NewsCLIPpings and VERITE benchmark datasets show that our method achieves significant improvements on fact-checking tasks and offers a new causal-reasoning perspective on multimodal fact-checking. To the best of our knowledge, our method is the first to integrate causal methods into fact-checking tasks.
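The counterfactual decomposition described above can be sketched in code. The idea, common to counterfactual debiasing methods, is that the debiased (indirect) effect is obtained by subtracting the single-modality direct effects from the total effect. The function and variable names below, the additive fusion of the two shortcut branches, and the weighting factor `alpha` are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def counterfactual_debias(fused_logits, text_only_logits, image_only_logits, alpha=1.0):
    """Hedged sketch of counterfactual debiasing for multimodal verification.

    fused_logits:      prediction logits with both modalities present (total effect)
    text_only_logits:  logits from the text-only branch (one shortcut path)
    image_only_logits: logits from the image-only branch (the other shortcut path)
    alpha:             assumed scaling of the shortcut (direct) effect
    """
    total_effect = fused_logits
    # Natural direct effect: what the model predicts from each modality alone,
    # i.e. the shortcut reasoning we want to remove.
    direct_effect = alpha * (text_only_logits + image_only_logits)
    # The remaining indirect effect is attributed to genuine multimodal reasoning.
    indirect_effect = total_effect - direct_effect
    return indirect_effect

# Toy example: the fused model leans toward class 0, but most of that
# preference comes from the text-only shortcut; after debiasing, class 1 wins.
fused = np.array([2.0, 1.0])
text  = np.array([1.5, 0.2])
image = np.array([0.3, 0.1])
print(counterfactual_debias(fused, text, image))  # [0.2 0.7]
```

In this toy example the subtraction flips the decision, illustrating how removing the shortcut effect can change the verdict that the shortcut alone would have produced.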