Abstract: With the emergence of social media and e-commerce platforms, accurate \emph{user profiling} has become increasingly vital for recommendation systems and personalized services.
Recent studies have focused on generating detailed user profiles by extracting diverse user attributes from textual reviews. However, these studies have not fully exploited the abundant multimodal data available on such platforms.
In this study, we propose a novel task, \emph{multimodal user profiling}, which leverages both review texts and their accompanying images to create comprehensive user profiles. By integrating textual and visual data, we exploit their complementary strengths and generate more holistic user representations.
Additionally, we explore a unified joint training framework with various multimodal training strategies that incorporate users' historical review texts and images for user profile generation.
Our experimental results underscore the significance of multimodal data in enhancing user profile generation and demonstrate the effectiveness of the proposed unified joint training approach.
Paper Type: Long
Research Area: Sentiment Analysis, Stylistic Analysis, and Argument Mining
Research Area Keywords: Sentiment Analysis
Languages Studied: English
Submission Number: 4393