Abstract: Automatic personality detection has evolved from simple text classification to sophisticated multimodal analysis, reflecting the recognition that personality manifests along multiple dimensions beyond text. This shift highlights the need for datasets that capture the complexity of human personality through diverse modalities. We introduce PersonaMovs, a large and varied multimedia conversational dataset built on 305 movies and 14 TV series, featuring over 46k dialogues, 552k utterances, 4016 characters, and 963 hours of video. PersonaMovs addresses the challenges of existing datasets by offering majority-voted personality annotations and detailed social relation networks, and it paves the way for advanced analysis of personality interactions across various contexts.
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: human behavior analysis, NLP tools for social analysis
Languages Studied: English
Submission Number: 5066