Blank-filling: Missing Modality-Simulated Network for Robust Multimodal Fact Verification

ACL ARR 2024 December Submission1348 Authors

16 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: Multimodal fact verification aims to assess the truthfulness of multimodal claims against retrieved textual and visual evidence. In practice, however, multimodal information may be incomplete in the original posts or lost during data collection. Existing missing-modality studies still cannot properly handle such complex missing situations in claim-evidence input pairs for multimodal fact verification, as they fail to capture the complicated relations between claims and evidence. To address these problems, we propose a novel model, the Missing Modality-Simulated Network (MMSN), for more robust and adaptive multimodal fact verification. We design a dual-channel soft simulation module that exploits both cross-modal information and claim-evidence correlations to simulate missing features with a soft-weighted method. In addition, MMSN extracts fine-grained textual key information and combines coarse-grained and fine-grained fusion to integrate multimodal information and capture cross-modal interactions exhaustively. Experimental results on three real-world public datasets demonstrate the superiority and effectiveness of MMSN for robust multimodal fact verification.
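
The abstract describes a dual-channel, soft-weighted simulation of missing modality features but gives no architectural details. Below is a minimal, illustrative PyTorch sketch of one plausible reading: one channel projects the available modality cross-modally, the other attends over retrieved evidence to exploit claim-evidence correlations, and a learned soft gate blends the two candidates. All module names, dimensions, and the gating formulation are assumptions for illustration only, not the paper's actual MMSN implementation.

```python
import torch
import torch.nn as nn


class SoftMissingSimulator(nn.Module):
    """Illustrative sketch: simulate a missing modality feature from two channels,
    (1) a cross-modal projection of the available modality, and
    (2) a claim-evidence correlation channel over retrieved evidence features,
    blended with a learned soft weight."""

    def __init__(self, dim: int = 256):
        super().__init__()
        self.cross_modal_proj = nn.Linear(dim, dim)  # channel 1: from the present modality
        self.evidence_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())  # soft weight

    def forward(self, present_feat: torch.Tensor, evidence_feats: torch.Tensor) -> torch.Tensor:
        # present_feat: (B, D) feature of the modality that is available
        # evidence_feats: (B, N, D) retrieved evidence features used for correlation
        cand_cross = self.cross_modal_proj(present_feat)                     # (B, D)
        query = present_feat.unsqueeze(1)                                    # (B, 1, D)
        cand_evid, _ = self.evidence_attn(query, evidence_feats, evidence_feats)
        cand_evid = cand_evid.squeeze(1)                                     # (B, D)
        w = self.gate(torch.cat([cand_cross, cand_evid], dim=-1))           # (B, D) soft weights
        return w * cand_cross + (1.0 - w) * cand_evid                       # simulated missing feature


if __name__ == "__main__":
    sim = SoftMissingSimulator(dim=256)
    text_feat = torch.randn(2, 256)    # available claim text feature (hypothetical)
    evidence = torch.randn(2, 5, 256)  # retrieved evidence features (hypothetical)
    simulated_image_feat = sim(text_feat, evidence)
    print(simulated_image_feat.shape)  # torch.Size([2, 256])
```

The elementwise gate lets the model lean on cross-modal projection when the available modality is informative and on evidence correlation otherwise; how MMSN actually weights the two channels is not specified in the abstract.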
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: multimodality, cross-modal application
Languages Studied: English
Submission Number: 1348