Abstract: Highlights
• A multimodal fusion network is proposed for effective visual–textual sentiment analysis.
• The proposed method eliminates the heterogeneity between visual and textual features.
• Attention mechanisms are used to minimize noise interference.
• Correlations between local region feature representations are leveraged.
• Extensive experiments demonstrate new state-of-the-art performance.
External IDs: dblp:journals/eswa/GanFFZCZ24