Sifting Truth From Spectacle! A Multimodal Hindi Dataset for Misinformation Detection With Emotional Cues and Sentiments

Raghvendra Kumar, Pulkit Bansal, Raunak Kumar Singh, Sriparna Saha

Published: 2026, Last Modified: 23 Apr 2026. Venue: IEEE Trans. Affect. Comput. 2026. License: CC BY-SA 4.0

Abstract: Misinformation poses a growing threat across media ecosystems, yet research on Hindi, one of the world’s most widely spoken languages, remains limited. We introduce a novel multimodal Hindi dataset of 6,544 article–image pairs to advance misinformation detection. Unlike existing English-centric and predominantly unimodal datasets, ours integrates text, images, and affective signals while being carefully cleaned of veracity cues to avoid artefact-driven inflation. Each sample is annotated with sentiment and emotions, making this the first Hindi resource with multimodal and affective dimensions. Through extensive experiments using IndicBART, IndicBERT, mBERT, and Vision Transformer models, we demonstrate the effectiveness of text–image fusion and affective features across multiple configurations. We also analyze the readability characteristics of genuine and misleading articles, providing insights into the linguistic patterns of Hindi misinformation. This dataset establishes a robust benchmark for multimodal misinformation detection and lays essential groundwork for research in Hindi and other low-resource languages.
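
To illustrate what text–image fusion with affective features can look like, here is a minimal sketch that assumes simple late fusion by concatenation. The abstract does not specify the fusion strategy, so this is not the authors' architecture. The embedding sizes, the 3-class sentiment encoding, the 6-label emotion encoding, and the function names `fuse_features` and `predict_proba` are all hypothetical. Random vectors stand in for the outputs of a text encoder (such as IndicBERT) and a Vision Transformer.

```python
import numpy as np

def fuse_features(text_emb, image_emb, sentiment_onehot, emotion_multihot):
    """Late fusion by concatenation: one flat vector per article-image pair."""
    return np.concatenate([text_emb, image_emb, sentiment_onehot, emotion_multihot])

def predict_proba(fused, weights, bias):
    """Logistic-regression head on the fused vector; returns a value in (0, 1)."""
    logit = float(fused @ weights + bias)
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(0)

# Stand-ins for encoder outputs (assumed 768-dimensional; not from the paper).
text_emb = rng.normal(size=768)
image_emb = rng.normal(size=768)

# Hypothetical affective annotations for one sample.
sentiment = np.array([0.0, 1.0, 0.0])                 # assumed 3-class one-hot
emotions = np.array([1.0, 0.0, 0.0, 1.0, 0.0, 0.0])   # assumed 6-label multi-hot

fused = fuse_features(text_emb, image_emb, sentiment, emotions)

# Untrained weights, used only to show the shapes involved.
weights = rng.normal(scale=0.01, size=fused.shape[0])
prob = predict_proba(fused, weights, bias=0.0)

print(fused.shape, prob)
```

In a real setup the classifier head would be trained on labeled pairs. Dropping the sentiment or emotion slice from the concatenation is one way to run the kind of configuration comparison the abstract mentions.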