Abstract: Multimodal Aspect-based Sentiment Analysis (MABSA) aims to extract aspect-sentiment pairs from a combination of text and images. However, images often contain content that is irrelevant to the text or unrelated to sentiment prediction, which can adversely affect model accuracy. Furthermore, existing models neglect fine-grained regional information beyond global image features; such information could aid aspect-based sentiment prediction, but it may also introduce noise that harms the model. To address these issues, this study proposes Knowledge-injected Mixture-of-Prefix (KMoP), which injects multiple types of knowledge into the language model while reducing external noise. Specifically, external knowledge is injected into the language model in the form of prefixes, which mitigates catastrophic forgetting and yields noise-insensitive representations. Additionally, to allow different layers of the language model to automatically select the knowledge they require, a Mixture-of-Prefix mechanism aggregates prefixes from the different knowledge sources separately for each layer. The training process is also divided into two phases: the first trains on the original clean dataset, and the second fine-tunes on the original dataset with added noise. KMoP achieves state-of-the-art performance on the MABSA task, and extensive supplementary experiments demonstrate its enhanced robustness to noise.
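To make the per-layer aggregation concrete, the following is a minimal sketch of how a Mixture-of-Prefix module could mix prefixes from several knowledge sources for a single language-model layer. The paper's actual implementation details are not given here; the module name, the gating design (a softmax gate over pooled hidden states), and all dimensions are illustrative assumptions.

```python
# Illustrative sketch of per-layer Mixture-of-Prefix gating.
# NOTE: module structure, gating, and sizes are assumptions,
# not the authors' released implementation.
import torch
import torch.nn as nn


class MixtureOfPrefix(nn.Module):
    """Aggregates prefixes from several knowledge sources for one LM layer.

    Each knowledge source (e.g. global image features, regional image
    features, textual knowledge) contributes a learnable prefix; a gate
    conditioned on the layer's hidden states mixes them, so different
    layers can select different knowledge.
    """

    def __init__(self, num_sources: int, prefix_len: int, hidden_dim: int):
        super().__init__()
        # One learnable prefix per knowledge source: (S, P, H).
        self.prefixes = nn.Parameter(
            torch.randn(num_sources, prefix_len, hidden_dim) * 0.02
        )
        # Gate maps pooled hidden states to a distribution over sources.
        self.gate = nn.Linear(hidden_dim, num_sources)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (B, T, H) hidden states entering this layer.
        pooled = hidden.mean(dim=1)                       # (B, H)
        weights = torch.softmax(self.gate(pooled), dim=-1)  # (B, S)
        # Weighted sum over sources -> one aggregated prefix per example.
        prefix = torch.einsum("bs,sph->bph", weights, self.prefixes)
        return prefix                                     # (B, P, H)


if __name__ == "__main__":
    mop = MixtureOfPrefix(num_sources=3, prefix_len=8, hidden_dim=768)
    h = torch.randn(2, 16, 768)   # dummy hidden states for one layer
    prefix = mop(h)               # would be prepended to the layer's keys/values
    print(prefix.shape)           # torch.Size([2, 8, 768])
```

Instantiating one such module per transformer layer lets each layer learn its own gate, which matches the abstract's claim that different layers automatically select the knowledge they need.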