AraHarmMeme: A Multimodal Benchmark for Harmful Meme Detection in Arabic

ACL ARR 2026 January Submission10116 Authors

06 Jan 2026 (modified: 20 Mar 2026) · License: CC BY 4.0
Keywords: Harmful Meme Detection, Deep Learning, Multimodal AI, Vision–Language Models, Content Moderation, Dialectal Arabic
Abstract: In this work, we introduce AraHarmMeme, a benchmark dataset for harmful content detection in Arabic memes. The dataset consists of 5,313 Arabic social media images collected from diverse platforms and manually annotated into three classes: harmful memes, non-harmful memes, and non-meme content. To the best of our knowledge, AraHarmMeme is the first Arabic multimodal benchmark that explicitly addresses harmful meme detection. We conduct experiments with text-only models, image-only models, supervised multimodal models, and large vision–language models evaluated in zero-shot settings. The results show that Arabic-focused text models consistently outperform multilingual baselines, while image-only models are insufficient for reliable harmful content detection. Supervised multimodal models yield improvements when strong Arabic text encoders are combined with appropriate visual backbones, whereas zero-shot large vision–language models remain considerably less effective than supervised approaches on this dataset. The dataset and evaluation protocol will be publicly released to support future research in Arabic multimodal safety and content moderation.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Multimodal NLP, Dataset and Benchmark Construction, Arabic NLP, Harmful Content Detection, Evaluation and Analysis
Contribution Types: NLP engineering experiment, Data resources, Data analysis
Languages Studied: Arabic
Submission Number: 10116