MemeCourt: a VLMs-Based Court with Multi-Agent Collaboration Framework for Harmful Meme Detection

ACL ARR 2025 May Submission3017 Authors

19 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Harmful Internet memes have become pervasive across social media owing to their visual appeal and satirical content. Unlike overtly hateful images or text, multimodal memes often embed implicit content, making their detection a novel challenge. Although harmful meme detection has been explored extensively, most existing methods lack interpretability in their judgments. To bridge this gap, we introduce MemeCourt: a Vision-Language Model (VLM)-based court with a multi-agent collaboration framework. MemeCourt enhances the extraction of implicit features through a reasoning pipeline: the Proposer-Agent engages in multi-round interactive questioning with the Accuser-Agent and Defender-Agent, each of which generates evidence supporting its respective stance. The Judge-Agent then integrates this evidence with precedent cases to make a final judgment on meme harmfulness. Experiments on three publicly available meme datasets demonstrate that our approach achieves SOTA performance and improves interpretability by tracing the explicit judging process.
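The reasoning pipeline described in the abstract can be sketched as a simple debate loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the role prompts, the `query_vlm` stub, and the canned responses are all hypothetical stand-ins for real VLM calls.

```python
def query_vlm(role, prompt):
    """Stand-in for a real VLM call; returns a canned response per role.
    A real system would send the meme image and `prompt` to a VLM."""
    canned = {
        "proposer": "Does the caption mock a protected group given the image?",
        "accuser": "Evidence: the caption ridicules the depicted group.",
        "defender": "Evidence: the caption reads as self-deprecating satire.",
        "judge": "harmful",
    }
    return canned[role]

def meme_court(meme, precedents, rounds=2):
    """Multi-round debate: the Proposer poses questions, the Accuser and
    Defender each answer with evidence for their stance, and the Judge
    integrates the full transcript plus precedent cases into a verdict."""
    transcript = []
    for _ in range(rounds):
        question = query_vlm("proposer", meme)
        transcript.append(("proposer", question))
        transcript.append(("accuser", query_vlm("accuser", question)))
        transcript.append(("defender", query_vlm("defender", question)))
    # Judge sees the whole debate plus retrieved precedent cases.
    context = transcript + [("precedent", p) for p in precedents]
    verdict = query_vlm("judge", str(context))
    return verdict, transcript

verdict, log = meme_court("example meme", ["case-001: similar meme ruled harmful"])
```

With two rounds, the transcript holds six turns (proposer, accuser, defender per round) before the judge rules; the precedent list models the retrieval of prior judged cases mentioned in the abstract.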
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: hate-speech detection
Contribution Types: Model analysis & interpretability
Languages Studied: English
Keywords: hate-speech detection
Submission Number: 3017