Keywords: Harmful Content Detection, Meme Understanding
Abstract: Harmful memes pose a distinct challenge for content moderation: they compress social meaning into a visual template with short text, often relying on irony, intertextuality, and local cultural context. Current multimodal safety datasets focus either on narrow harm definitions or on predominantly English meme ecosystems, limiting progress toward robust, culturally grounded detection. We introduce CHARMemes, a large-scale benchmark for harm-aware meme understanding in the Russian-oriented online ecosystem (Runet). The dataset contains 23,025 in-the-wild memes collected from a template-centric public source and spans 2007–2025, enabling temporal and cross-domain analyses. We propose a taxonomy of 11 harm categories grouped into three severity buckets, and build the dataset with a scalable pipeline combining automated labeling, targeted human review, and multi-stage deduplication.
We benchmark multiple vision–language approaches under binary, severity-level, and fine-grained settings, and evaluate robustness across time and datasets. Results indicate that binary harmfulness detection is comparatively strong, while fine-grained harm recognition remains difficult, especially under temporal shift. CHARMemes offers a realistic testbed for developing more robust harmful meme detectors.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: corpus creation, multilingual corpora
Contribution Types: Data resources
Languages Studied: Russian, English
Submission Number: 5382