HAGCN-BERT: Hierarchical Attention Graph Convolution Network with In-Domain Adapted Self-Supervised Pretraining with Hierarchical Graph Convolution for Multimodal Sentiment Analysis
Abstract: This paper introduces a multimodal sentiment analysis model designed to overcome the limitations of conventional text-only approaches and of approaches with insufficient modality fusion. Traditional pretrained language models such as BERT, when applied to sentiment analysis, fall short in recognizing subtle emotions and domain-specific vocabulary. To address these problems, this study proposes a self-supervised pretraining framework that combines standard masked language modeling with an innovative marking strategy, yielding a domain-specific pretrained model. It also introduces a Hierarchical Attention Graph Convolution Network, which captures both fine-grained interactions and global context; by merging features across modalities and suppressing redundant noise, it makes efficient use of multimodal data. In addition, the model adopts a unified multimodal fusion network that integrates text, visual, and audio features. The model is evaluated on the social-media-based datasets CMU-MOSEI and CMU-MOSI. Compared with the latest models, the results show an average improvement of about 1.5 points in major performance indicators such as mean absolute error, along with a notable increase in Pearson's correlation. These results confirm that the proposed framework, with its improved pretraining and effective multimodal fusion, is robust in capturing intricate sentiment signals in specialized domains.
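For readers unfamiliar with the two reported indicators, the following is a minimal sketch of how mean absolute error and Pearson's correlation are commonly computed for regression-style sentiment scores. The function names and the toy labels and predictions are illustrative assumptions, not the authors' evaluation code or data.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between labels and predictions (lower is better)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(y_true, y_pred):
    """Linear correlation between labels and predictions (higher is better)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical sentiment scores, made up purely for illustration.
labels = [-2.0, -1.0, 0.0, 1.0, 2.0]
preds = [-1.5, -1.0, 0.5, 0.5, 2.5]

mae = mean_absolute_error(labels, preds)
corr = pearson_correlation(labels, preds)
print(f"MAE = {mae:.3f}, Pearson r = {corr:.3f}")
```

Because a lower mean absolute error is better while a higher correlation is better, an "improvement" in the former corresponds to a decrease in its value.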