Learning in Order! A Sequential Strategy to Learn Invariant Features for Multimodal Sentiment Analysis
Abstract: This work proposes a novel and simple sequential learning strategy to train models on videos and texts for multimodal sentiment analysis. To estimate sentiment polarities on unseen out-of-distribution data, we introduce a multimodal model that is trained on either a single source domain or multiple source domains using our learning strategy. This strategy starts with learning domain-invariant features in text, followed by learning sparse domain-agnostic features in videos, assisted by the selected features learned in text. Our experimental results demonstrate that our model performs significantly better than the state-of-the-art approaches in both single-source and multi-source settings. Our feature selection procedure favors features that are independent of each other and strongly correlated with their polarity labels. To facilitate research on this topic, the source code of this work will be made publicly available upon acceptance.
Primary Subject Area: [Engagement] Emotional and Social Signals
Relevance To Conference: The paper designs a simple yet effective multimodal learning strategy that learns cross-domain invariant features for multimodal sentiment analysis. In addition to its excellent performance, it also provides insights into the correlation among domain-invariant features in multimodal settings.
Supplementary Material: zip
Submission Number: 2340