Infer Induced Sentiment of Comment Response to Video: A New Task, Dataset and Baseline

Published: 26 Sept 2024, Last Modified: 13 Nov 2024NeurIPS 2024 Track Datasets and Benchmarks PosterEveryoneRevisionsBibTeXCC BY-NC-SA 4.0
Keywords: Multi-Modal; Micro Video; Sentiment Analysis; Emotion
Abstract: Existing video multi-modal sentiment analysis mainly focuses on the sentiment expression of people within the video, yet often neglects the induced sentiment of viewers while watching the videos. Induced sentiment of viewers is essential for inferring the public response to videos and has broad application in analyzing public societal sentiment, effectiveness of advertising and other areas. The micro videos and the related comments provide a rich application scenario for viewers’ induced sentiment analysis. In light of this, we introduces a novel research task, Multimodal Sentiment Analysis for Comment Response of Video Induced(MSA-CRVI), aims to infer opinions and emotions according to comments response to micro video. Meanwhile, we manually annotate a dataset named Comment Sentiment toward to Micro Video (CSMV) to support this research. It is the largest video multi-modal sentiment dataset in terms of scale and video duration to our knowledge, containing 107, 267 comments and 8, 210 micro videos with a video duration of 68.83 hours. To infer the induced sentiment of comment should leverage the video content, we propose the Video Content-aware Comment Sentiment Analysis (VC-CSA) method as a baseline to address the challenges inherent in this new task. Extensive experiments demonstrate that our method is showing significant improvements over other established baselines. We make the dataset and source code publicly available at https://github.com/IEIT-AGI/MSA-CRVI.
Supplementary Material: pdf
Flagged For Ethics Review: true
Submission Number: 369
Loading