Fine-Grained Cross-Modal Graph Convolution for Multimodal Aspect-Oriented Sentiment Analysis

Published: 01 Jan 2023 · Last Modified: 13 May 2025 · SMC 2023 · CC BY-SA 4.0
Abstract: Aspect-oriented multimodal sentiment analysis aims to identify the sentiment associated with a given aspect from paired text and image inputs. Existing methods focus on the interaction between aspects, text, and images, and have made significant progress with cross-modal transformers. However, they still suffer from three problems: (1) they ignore the dependency relationships between objects within the image modality; (2) they fail to exploit syntactic dependency relationships within the text modality when capturing aspect-related opinion words; and (3) they neglect the inherent dependency relationships between the two modalities. To address these issues, we propose a fine-grained cross-modal graph convolutional network (FCGCN). Specifically, we construct intra-modality dependency graphs from syntactic relationships (text) and spatial relationships (image), and fuse the two modalities by computing cross-modal semantic similarity. We then design a GCN-Attention layer to capture richer fused multimodal information. Additionally, an aspect-oriented transformer module is introduced to interactively capture aspect-specific features. Experimental results on the Twitter datasets show that FCGCN consistently outperforms state-of-the-art methods.
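To make the fusion step concrete, below is a minimal sketch of the idea the abstract describes: joining an intra-modality text graph and image graph through cross-modal edges weighted by semantic similarity, then applying a graph convolution over the combined graph. This is not the authors' released implementation; the class name `CrossModalGCNLayer`, the tensor shapes, and the single-layer design are illustrative assumptions.

```python
# Sketch (assumed, not the paper's code): one GCN layer over a joint
# text-image graph whose cross-modal edges come from cosine similarity.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalGCNLayer(nn.Module):
    """One graph-convolution layer over a joint text-image node graph."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, text_feats, image_feats, text_adj, image_adj):
        # text_feats:  (n_tokens, dim)    token embeddings
        # image_feats: (n_regions, dim)   region embeddings
        # text_adj:    (n_tokens, n_tokens)   e.g. from a dependency parse
        # image_adj:   (n_regions, n_regions) e.g. from spatial relations

        # Cross-modal edges via cosine similarity between tokens and regions
        # (the "semantic similarity calculation" mentioned in the abstract).
        t = F.normalize(text_feats, dim=-1)
        v = F.normalize(image_feats, dim=-1)
        cross = torch.relu(t @ v.t())  # (n_tokens, n_regions), keep positive links

        # Assemble the block adjacency of the joint graph:
        # [ text_adj   cross     ]
        # [ cross^T    image_adj ]
        top = torch.cat([text_adj, cross], dim=1)
        bottom = torch.cat([cross.t(), image_adj], dim=1)
        adj = torch.cat([top, bottom], dim=0)

        # Add self-loops and row-normalize so each node averages its neighbors.
        adj = adj + torch.eye(adj.size(0))
        adj = adj / adj.sum(dim=1, keepdim=True)

        # Standard GCN update: aggregate, project, apply nonlinearity.
        nodes = torch.cat([text_feats, image_feats], dim=0)
        out = torch.relu(self.proj(adj @ nodes))

        n_tokens = text_feats.size(0)
        return out[:n_tokens], out[n_tokens:]  # updated text / image features


# Toy usage: 5 tokens, 3 image regions, 64-dim features.
layer = CrossModalGCNLayer(dim=64)
text = torch.randn(5, 64)
image = torch.randn(3, 64)
text_adj = torch.eye(5)   # placeholder for a dependency-parse adjacency
image_adj = torch.eye(3)  # placeholder for a spatial-relation adjacency
new_text, new_image = layer(text, image, text_adj, image_adj)
print(new_text.shape, new_image.shape)  # torch.Size([5, 64]) torch.Size([3, 64])
```

In this sketch, the row-normalized block adjacency plays the role of the fused dependency structure; the paper's full model additionally stacks a GCN-Attention layer and an aspect-oriented transformer on top of such fused representations.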