MemeCourt: a VLMs-Based Court with Multi-Agent Collaboration Framework for Harmful Meme Detection

ACL ARR 2025 May Submission3017 Authors

19 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Harmful Internet memes have become pervasive across social media owing to their visual appeal and satirical content. Unlike overtly hateful images or text, multimodal memes often embed implicit content, making their detection a novel challenge. Although harmful meme detection has been explored extensively, most existing methods lack interpretability in their judgments. To bridge this gap, we introduce MemeCourt: a Vision-Language Model (VLM)-based court with a multi-agent collaboration framework. MemeCourt enhances the extraction of implicit features through a reasoning pipeline: the Proposer-Agent engages in multi-round interactive questioning with the Accuser-Agent and Defender-Agent, each of which generates evidence supporting its respective stance. The Judge-Agent then integrates this evidence with precedent cases to make a final judgment on meme harmfulness. Experiments on three publicly available meme datasets demonstrate that our approach achieves SOTA performance and improves interpretability by tracing the explicit judging process.
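The reasoning pipeline described in the abstract can be sketched as a simple debate loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the role prompts, the `query_vlm` stub, and the canned responses are all hypothetical stand-ins for real VLM calls.

```python
def query_vlm(role, prompt):
    """Stand-in for a real VLM call; returns a canned response per role.
    A real system would send the meme image and `prompt` to a VLM."""
    canned = {
        "proposer": "Does the caption mock a protected group given the image?",
        "accuser": "Evidence: the caption ridicules the depicted group.",
        "defender": "Evidence: the caption reads as self-deprecating satire.",
        "judge": "harmful",
    }
    return canned[role]

def meme_court(meme, precedents, rounds=2):
    """Multi-round debate: the Proposer poses questions, the Accuser and
    Defender each answer with evidence for their stance, and the Judge
    integrates the full transcript plus precedent cases into a verdict."""
    transcript = []
    for _ in range(rounds):
        question = query_vlm("proposer", meme)
        transcript.append(("proposer", question))
        transcript.append(("accuser", query_vlm("accuser", question)))
        transcript.append(("defender", query_vlm("defender", question)))
    # Judge sees the whole debate plus retrieved precedent cases.
    context = transcript + [("precedent", p) for p in precedents]
    verdict = query_vlm("judge", str(context))
    return verdict, transcript

verdict, log = meme_court("example meme", ["case-001: similar meme ruled harmful"])
```

With two rounds, the transcript holds six turns (proposer, accuser, defender per round) before the judge rules; the precedent list models the retrieval of prior judged cases mentioned in the abstract.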
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: hate-speech detection
Contribution Types: Model analysis & interpretability
Languages Studied: English
Keywords: hate-speech detection
Submission Number: 3017