Abstract: A templatic meme has a base semantics that whoever posts it on social media can tailor. Machine learning systems that treat memes as merely images with text perform poorly, likely because they lack sufficient context: there can be more to a meme than its obvious image and text. To aid meme understanding, we release a knowledge base of memes comprising more than 5,200 meme templates, detailed information about each one, and 54,000 examples of template instances (templatic memes). To demonstrate the semantic signal of meme templates, we formulate a majority-based, non-parametric classifier that leverages our knowledge base. Our method outperforms more expensive techniques but exposes an underlying issue with meme datasets: template information leaks from the training data, and models can exploit it in ways we may not want. To control the impact of this template awareness, we re-split datasets to account for the influence of meme templates. Our re-split datasets discourage undesirable shortcuts to meme understanding, resulting in increased model robustness. This work sets the state of the art on five of the six tasks that we consider.
Paper Type: long
Research Area: Resources and Evaluation
Contribution Types: Model analysis & interpretability, Approaches low compute settings-efficiency, Publicly available software and/or pre-trained models, Data resources, Data analysis, Position papers
Languages Studied: English, Hindi
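
Illustrative sketch: the abstract describes a majority-based, non-parametric classifier built on template information. The snippet below is only a minimal illustration of that general idea, not the authors' implementation. It assumes each meme has already been matched to a template, and the names `fit_template_votes`, `predict`, and the template identifiers are hypothetical. How memes are matched to templates is not specified here and is left out.

```python
from collections import Counter, defaultdict


def fit_template_votes(train_examples):
    """Count labels per template from (template_id, label) pairs.

    Assumption: template_id comes from some upstream template-matching
    step that is not shown here.
    """
    votes = defaultdict(Counter)  # template_id -> label counts
    overall = Counter()           # label counts over all training memes
    for template_id, label in train_examples:
        votes[template_id][label] += 1
        overall[label] += 1
    return votes, overall


def predict(template_id, votes, overall):
    """Return the majority label for the template.

    Falls back to the overall majority label when the template was not
    seen in training (assumes at least one training example exists).
    """
    counts = votes.get(template_id)
    if counts:
        return counts.most_common(1)[0][0]
    return overall.most_common(1)[0][0]


# Hypothetical toy data: template names and labels are made up.
train = [
    ("drake", "neg"), ("drake", "neg"), ("drake", "pos"),
    ("doge", "pos"), ("doge", "pos"),
]
votes, overall = fit_template_votes(train)
label = predict("drake", votes, overall)
```

Because the prediction depends only on which template a meme instantiates, a sketch like this also makes the leakage concern concrete: when the same templates appear in both training and test splits, label statistics carry over directly.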