Multimodal Sarcasm Detection Based on Multimodal Sentiment Co-training

Yi Liu, Zengwei Zheng, Binbin Zhou, Jianhua Ma, Lin Sun, Ruichen Xia

Published: 01 Jan 2022, Last Modified: 19 Feb 2025. Venue: SmartWorld/UIC/ScalCom/DigitalTwin/PriComp/Meta 2022. License: CC BY-SA 4.0

Abstract: Sarcasm detection is a challenging task in sentiment analysis because sarcastic expressions often combine positive and negative sentiments. In recent years, visual information has been used to study sarcasm in social media data. Based on the sentiment contrast between image and text, this paper proposes a Multimodal Sentiment and Sarcasm Gradient Co-training (MSSGC) model. The model uses text and image feature-sharing networks to explicitly learn sentiment features from image and text sentiment datasets, and integrates a cross-modal fusion module for Multimodal Sarcasm Detection (MSD). The training algorithm exploits these sentiment features for sarcasm detection by weighting the sentiment and sarcasm classification gradients. Extensive experiments, including case studies, are performed to evaluate the MSSGC model. The results show that the proposed model outperforms recent MSD models. The code is available at: https://github.com/vantree/MSSGC.
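The idea of weighting sentiment and sarcasm classification gradients on shared features can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the weighting rule, so the fixed scalar weights `w_sar` and `w_sen`, the toy linear shared encoder `W`, the logistic task heads, and all function names below are assumptions chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def task_gradients(W, v, x, y):
    """Binary cross-entropy gradients for logit = v . (W x).

    W is the shared encoder (hypothetical), v is one task head.
    """
    h = W @ x                       # shared feature vector
    err = sigmoid(v @ h) - y        # dLoss/dlogit for logistic loss
    grad_W = err * np.outer(v, x)   # gradient reaching the shared encoder
    grad_v = err * h                # gradient for the task-specific head
    return grad_W, grad_v

def cotrain_step(W, v_sar, v_sen, sar_example, sen_example,
                 w_sar=1.0, w_sen=0.5, lr=0.1):
    """One SGD step with a weighted sum of the two tasks' gradients.

    The fixed weights w_sar and w_sen are an assumption; the paper's
    actual weighting scheme is not given in the abstract.
    """
    gW_sar, gv_sar = task_gradients(W, v_sar, *sar_example)
    gW_sen, gv_sen = task_gradients(W, v_sen, *sen_example)
    # Shared parameters receive the weighted combination of both gradients.
    gW = w_sar * gW_sar + w_sen * gW_sen
    # Each head is updated only by its own task's gradient.
    return W - lr * gW, v_sar - lr * gv_sar, v_sen - lr * gv_sen

# Usage: one step on a random sarcasm example and a random sentiment example.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
v_sar, v_sen = rng.normal(size=3), rng.normal(size=3)
sar_example = (rng.normal(size=4), 1.0)   # (features, sarcasm label)
sen_example = (rng.normal(size=4), 0.0)   # (features, sentiment label)
W, v_sar, v_sen = cotrain_step(W, v_sar, v_sen, sar_example, sen_example)
```

Setting `w_sen=0.0` in this sketch reduces the shared update to plain sarcasm-only training, which makes the effect of the sentiment gradient easy to isolate.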