Abstract: Memes, widely used in social media, serve both entertainment and communication purposes but can also contain offensive elements. Given the vastness of internet content, automated techniques are necessary for categorizing offensive memes and preventing their spread. This research explores the field of hateful meme identification, highlighting current limitations and the potential of combining vision and language models to utilize information from both the image and text modalities. The proposed framework first uses an 'Image Captioning block' to extract a meaningful textual description from the input meme image; a 'Fusion and Classification block' then combines features from the image and text modalities and generates separate classification results from three transformer-based language models; finally, a 'Decision block' produces the final prediction from an ensemble of the three outputs. We evaluate our proposed framework on the Hateful Memes Challenge Dataset, obtaining an accuracy of 72.2% and an AUROC score of 0.7708. In addition, we provide a comprehensive analysis of the characteristics of memes that make them difficult to classify accurately, which in turn offers valuable insight to future researchers by explaining how certain kinds of memes are misclassified by the models.
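To illustrate the role of the 'Decision block' described above, the following is a minimal sketch of a soft-voting ensemble over three per-model probabilities. The averaging rule, threshold, and function name are assumptions for illustration only; the paper's actual ensembling details may differ.

```python
def ensemble_decision(probs, threshold=0.5):
    """Hypothetical Decision block: average the 'hateful' probabilities
    produced by the three transformer-based classifiers and threshold
    the mean to obtain a binary label."""
    mean_prob = sum(probs) / len(probs)
    label = "hateful" if mean_prob >= threshold else "non-hateful"
    return mean_prob, label

# Illustrative per-model probabilities for a single meme (made-up values).
p_models = [0.62, 0.48, 0.71]
mean_p, label = ensemble_decision(p_models)
```

A majority vote over hard labels would be an alternative design; soft voting is shown here because it preserves each model's confidence, which also makes AUROC-style evaluation straightforward.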