Abstract: Stance detection is the task of identifying and analyzing an author's stance from text. Previous studies have focused primarily on the text itself, which may not fully capture the implicit stance conveyed by the author. To address this limitation, we propose a novel approach that transforms original texts into artificially generated images and uses these visual representations to enhance stance detection. Our approach first employs a text-to-image model to generate candidate images for each text. These images are carefully crafted to adhere to three specific criteria: textual relevance, target consistency, and stance consistency. Next, we introduce a comprehensive evaluation framework to select the optimal image for each text from its generated candidates. Finally, we present a multimodal stance detection model that leverages both the original textual content and the generated image to identify the author's stance. Experiments demonstrate the effectiveness of our approach and highlight the importance of artificially generated images for stance detection.
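The candidate-selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual evaluation framework: the criterion scores are assumed to come from upstream scoring models, and the equal-weighted sum used to combine them is a hypothetical choice.

```python
def select_best_image(candidates, weights=(1.0, 1.0, 1.0)):
    """Pick the candidate image whose weighted criterion scores sum highest.

    Each candidate is a dict with scores for the three criteria named in the
    abstract: textual relevance, target consistency, and stance consistency.
    The scores and the weighting scheme here are illustrative assumptions.
    """
    def overall(c):
        return (weights[0] * c["textual_relevance"]
                + weights[1] * c["target_consistency"]
                + weights[2] * c["stance_consistency"])
    return max(candidates, key=overall)

# Illustrative scores for two generated candidates for one text.
candidates = [
    {"id": "img_a", "textual_relevance": 0.8,
     "target_consistency": 0.6, "stance_consistency": 0.7},
    {"id": "img_b", "textual_relevance": 0.7,
     "target_consistency": 0.9, "stance_consistency": 0.8},
]
best = select_best_image(candidates)
```

The selected image would then be paired with the original text as input to the multimodal stance detection model.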
Paper Type: Long
Research Area: Sentiment Analysis, Stylistic Analysis, and Argument Mining
Research Area Keywords: Stance Detection, Artificial Image Generation, Multimodal Learning
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 4079