Multimodal and Multilingual Fact-Checked Article Retrieval

Stefanos-Iordanis Papadopoulos, Ivana Benová, Sebastian Kula, Michal Gregor, George Karantaidis, Tomas Javurek, Marián Simko, Symeon Papadopoulos

Published: 2025, Last Modified: 24 Feb 2026, Venue: ICMR 2025, License: CC BY-SA 4.0

Abstract: Fact-Check Retrieval (FCR) plays a crucial role in automated fact-checking by retrieving relevant fact-checked articles for disputed claims. While recent work has explored text-based, multilingual, and multimodal FCR, most efforts remain unimodal or limited to English. To bridge this gap, we introduce M3-Check, the first FCR dataset combining multilingual texts and images from social media posts with fact-check articles from diverse, credible sources. Furthermore, we introduce FACTOR, a two-tower Transformer-based architecture that employs cross-tower parameter sharing and modality-wise aligned weight initialization. FACTOR outperforms zero-shot baselines, two-tower linear models, and vanilla Transformers, achieving a 17% improvement over the latter. Moreover, we conduct modality ablations and compare state-of-the-art encoders, showing that multilingual encoders like multi-E5 can provide an additional 13% performance improvement without requiring English translations.
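The idea of cross-tower parameter sharing in a retrieval setting can be illustrated with a minimal sketch. This is not the FACTOR architecture, which is Transformer-based and whose details are not given in the abstract. The sketch swaps in a single shared linear projection, assumes that text and image embeddings have already been produced by some pretrained encoder, and fuses them by plain concatenation. All function names and the toy vectors are hypothetical.

```python
import math
import random


def make_shared_projection(in_dim, out_dim, seed=0):
    """Random projection matrix used by BOTH towers (the shared parameters)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def project(weights, vec):
    """Matrix-vector product: one output value per row of the weight matrix."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def l2_normalize(vec):
    """Scale to unit length so the dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def encode(weights, text_emb, image_emb):
    """Fuse modalities by concatenation (an assumption), then project."""
    fused = list(text_emb) + list(image_emb)
    return l2_normalize(project(weights, fused))


def rank_articles(weights, post, articles):
    """Return article indices sorted from most to least similar to the post.

    The same `weights` encode the post (query tower) and the fact-check
    articles (document tower), which is what parameter sharing means here.
    """
    query = encode(weights, *post)
    scores = [
        sum(q * d for q, d in zip(query, encode(weights, *article)))
        for article in articles
    ]
    return sorted(range(len(articles)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Toy 2-d "text" and 2-d "image" embeddings standing in for real encoders.
    weights = make_shared_projection(in_dim=4, out_dim=3)
    post = ([1.0, 0.0], [0.0, 1.0])
    articles = [
        ([0.0, 1.0], [1.0, 0.0]),
        ([1.0, 0.0], [0.0, 1.0]),
        ([-1.0, 0.0], [0.0, -1.0]),
    ]
    print(rank_articles(weights, post, articles))
```

Because both sides pass through the same projection, an article whose embeddings match the post's maps to the same point and receives the maximum cosine score. A model with separate towers would have to learn that alignment from data.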

External IDs: dblp:conf/mir/PapadopoulosBKG25