Generating Multimodal Metaphorical Features for Meme Understanding

Published: 20 Jul 2024 · Last Modified: 06 Aug 2024 · MM 2024 Oral · CC BY 4.0
Abstract: Understanding a meme is a challenging task, due to the metaphorical information it contains, which requires intricate interpretation to grasp the intended meaning fully. Previous works have attempted to facilitate computational understanding of memes by introducing human-annotated metaphors as extra input features into machine learning models. However, these approaches mainly focus on formulating a linguistic representation of the metaphor (extracted from the texts appearing in memes), while ignoring the connection between the metaphor and the corresponding visual features (e.g., objects in meme images). In this paper, we argue that a more comprehensive understanding of memes can only be achieved by jointly modelling both the visual and linguistic features of memes. To this end, we propose an approach that generates Multimodal Metaphorical features for Meme Classification, named MMMC. Leveraging a text-conditioned generative adversarial network, MMMC derives visual characteristics from the linguistic attributes of metaphorical concepts, so that the underlying metaphor is conveyed more effectively. The linguistic and visual features are then integrated into a set of multimodal metaphorical features for classification purposes. We perform extensive experiments on a benchmark metaphorical meme dataset, MET-Meme. Experimental results show that MMMC significantly outperforms existing baselines on the tasks of emotion classification and intention detection. Our code and dataset are available at https://anonymous.4open.science/r/MMMC-C37B.
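The abstract describes a pipeline in which metaphor text conditions a generator that synthesizes a visual metaphorical feature, which is then fused with the linguistic feature for classification. The sketch below illustrates that idea only; the module names, feature dimensions, noise injection, and concatenation-based fusion are illustrative assumptions, not the authors' implementation or the MET-Meme preprocessing.

```python
# Minimal sketch of the MMMC idea described in the abstract (assumptions noted above).
import torch
import torch.nn as nn


class TextConditionedGenerator(nn.Module):
    """Maps a metaphor-text embedding (plus noise) to a synthetic visual feature."""

    def __init__(self, text_dim=256, noise_dim=64, visual_dim=256):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(text_dim + noise_dim, 512),
            nn.ReLU(),
            nn.Linear(512, visual_dim),
        )

    def forward(self, text_emb):
        noise = torch.randn(text_emb.size(0), self.noise_dim, device=text_emb.device)
        return self.net(torch.cat([text_emb, noise], dim=-1))


class MemeClassifier(nn.Module):
    """Fuses the linguistic feature with the generated visual metaphor feature."""

    def __init__(self, text_dim=256, visual_dim=256, num_classes=7):
        super().__init__()
        self.generator = TextConditionedGenerator(text_dim, 64, visual_dim)
        self.head = nn.Sequential(
            nn.Linear(text_dim + visual_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, text_emb):
        visual_feat = self.generator(text_emb)            # synthetic visual metaphor feature
        fused = torch.cat([text_emb, visual_feat], dim=-1)  # simple concatenation fusion
        return self.head(fused)


if __name__ == "__main__":
    model = MemeClassifier()
    dummy_text = torch.randn(4, 256)  # stand-in for an encoded metaphor/meme text
    logits = model(dummy_text)
    print(logits.shape)               # torch.Size([4, 7])
```

In practice the generator would be trained adversarially against a discriminator on paired text and image features; the concatenation fusion here is only one of several plausible choices.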
Primary Subject Area: [Content] Media Interpretation
Secondary Subject Area: [Content] Multimodal Fusion, [Content] Vision and Language
Relevance To Conference: “Memes” are a prevalent form of multimedia content that integrates visual imagery and textual elements. This research contributes to the domain of multimedia/multimodal processing by introducing a novel approach to meme understanding. The proposed method, MMMC (Multimodal Metaphorical features for Meme Classification), generates multimodal metaphorical features: it leverages a generative adversarial network (GAN) to synthesize visual representations of metaphorical concepts, thereby conveying metaphor information more effectively. The visual and textual metaphorical features are integrated into a unified multimodal framework, which substantially improves meme classification performance in our empirical evaluation. The study underscores the crucial role of visual-verbal integration in decoding multimodal metaphors and provides a framework for the comprehensive analysis of meme semantics.
Submission Number: 2352