Abstract: The explosive growth of multimedia has led many people to express their opinions through social media platforms such as Flickr and Facebook. As a result, social media has become a rich source of data for analyzing human emotions. Many earlier studies have sought to assess human emotions automatically, motivated by a wide range of applications in education, advertising, and entertainment. Recently, researchers have focused on visual content to find cues that evoke emotion; in the literature, this line of work is called visual sentiment analysis. Although many earlier studies on visual emotion analysis achieve strong performance, most are limited to classification over a predetermined set of emotion categories. In this paper, we aim to recognize emotion classes that do not appear in the training set. The proposed model is trained by mapping visual features to an emotional semantic representation embedded by the BERT language model. Evaluated on a cross-domain affective dataset, the model achieves 66% accuracy when predicting unseen emotions not included in the training set.
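The core idea described above, predicting unseen emotion classes by projecting visual features into a semantic label-embedding space, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 4-dimensional label vectors stand in for real BERT embeddings of emotion words, and the projection matrix `W` stands in for a trained visual-to-semantic mapping; all names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder label embeddings; in the paper's setting these would be
# BERT embeddings of the emotion words. "awe" plays the role of an
# emotion class unseen during training.
label_emb = {
    "joy":     np.array([1.0, 0.0, 0.2, 0.1]),
    "sadness": np.array([0.0, 1.0, 0.1, 0.3]),
    "awe":     np.array([0.6, 0.1, 0.9, 0.0]),  # unseen class
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A stand-in for the trained projection from an 8-dim visual feature
# to the 4-dim semantic space (random here, purely illustrative).
W = rng.normal(size=(4, 8))

def predict(visual_feat, candidates):
    """Project the visual feature into the semantic space and return
    the candidate emotion with the most similar label embedding.
    Because candidates can include classes absent from training,
    this enables zero-shot prediction."""
    z = W @ visual_feat
    return max(candidates, key=lambda c: cosine(z, label_emb[c]))

visual_feat = rng.normal(size=8)
pred = predict(visual_feat, ["joy", "sadness", "awe"])
print(pred)
```

The key design point is that the classifier never learns per-class weights: it only learns the mapping into the semantic space, so any emotion with a BERT embedding can be scored at test time.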