Bottleneck-Constrained Contrastive Decoupled Network for Multimodal Aspect-based Sentiment Classification
Abstract: Multimodal aspect-based sentiment classification (MABSC) is a challenging task that has emerged in recent years; it aims to combine text and images to identify the sentiment polarity of each aspect. Aspects and images may be irrelevant to each other, and mistakenly focusing on irrelevant image regions introduces redundancy and misalignment noise. In addition, existing methods implicitly mix visual and textual features, which may lose or blur modality-specific information. To address these challenges, we propose a Bottleneck-constrained Contrastive Decoupled Network (BCDN) for the MABSC task. We first design a bottleneck-constrained visual consistency module to reduce redundancy and misalignment noise in aspect-related visual features. We then employ modality decoupling to fully capture inter-modality knowledge: we decouple the aspect-related visual and textual representations into modality-invariant and modality-specific features, and propose novel contrastive regularizations to optimize the decoupled features. Extensive experiments on benchmark datasets demonstrate that BCDN achieves superior performance and verify its effectiveness. The code is released at https://github.com/ruiliu2020/BCDN.
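To make the decoupling idea concrete, the following is a minimal sketch and not the BCDN implementation. It assumes a shared linear projection for modality-invariant features, separate projections for modality-specific features, an InfoNCE-style loss to align the invariant parts, and a squared-cosine penalty to keep the specific parts apart. All function names, dimensions, and loss forms here are hypothetical illustrations; the actual regularizers are defined in the paper and the released code.

```python
import math
import random


def linear(x, w):
    """Project vector x with weight matrix w (rows are output units)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def cosine(a, b):
    """Cosine similarity with a small epsilon for numerical safety."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-8)


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss (assumed form): anchors[i] should match
    positives[i]; the other positives in the batch act as negatives."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [cosine(a, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def specific_separation(text_spec, image_spec):
    """Hypothetical penalty: mean squared cosine between the paired
    modality-specific features, pushing them toward orthogonality."""
    sims = [cosine(t, v) ** 2 for t, v in zip(text_spec, image_spec)]
    return sum(sims) / len(sims)


def random_matrix(rng, d_out, d_in):
    return [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]


def decouple(text_feats, image_feats, d_out=4, seed=0):
    """Split each modality into invariant and specific parts using
    untrained random projections (stand-ins for learned encoders)."""
    rng = random.Random(seed)
    d_in = len(text_feats[0])
    w_shared = random_matrix(rng, d_out, d_in)  # shared across modalities
    w_text = random_matrix(rng, d_out, d_in)    # text-only projection
    w_image = random_matrix(rng, d_out, d_in)   # image-only projection
    return {
        "text_inv": [linear(t, w_shared) for t in text_feats],
        "image_inv": [linear(v, w_shared) for v in image_feats],
        "text_spec": [linear(t, w_text) for t in text_feats],
        "image_spec": [linear(v, w_image) for v in image_feats],
    }


def decoupling_loss(parts, weight=1.0):
    """Align invariant features across modalities, separate specific ones."""
    align = info_nce(parts["text_inv"], parts["image_inv"])
    separate = specific_separation(parts["text_spec"], parts["image_spec"])
    return align + weight * separate
```

In this sketch, `decouple` would be called on a batch of aspect-related text and image feature vectors of equal dimension, and `decoupling_loss` would be added to the classification loss during training. The equal weighting of the two terms is an arbitrary choice made for illustration.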