Aspect-Based Multimodal Mining: Unveiling Sentiments, Complaints, and Beyond in User-Generated Content

Published: 20 Jul 2024, Last Modified: 21 Jul 2024 — MM 2024 Poster — CC BY 4.0
Abstract: Sentiment analysis and complaint identification are key tools for mining user preferences, measuring polarity and breaches of expectation, respectively. Recent works on complaint identification detect aspect categories and classify them into complaint or non-complaint classes. However, aspect category-based complaint identification provides only high-level information about product features. Moreover, users sometimes do not complain about a specific aspect but instead express concern about it in a respectful way. Current uni-modal and multimodal studies do not capture this thin line between complaint and concern. In this work, we propose the task of multimodal aspect term-based analysis beyond sentiments and complaints. It comprises two sub-tasks: (i) classification of a given aspect term into one of four classes, \textit{viz.} praise, concern, complaint, and others, and (ii) identification of the cause of the praise, concern, and complaint classes. We present the first benchmark explainable multimodal corpus annotated for aspect term-based complaints, praises, concerns, their corresponding causes, and sentiments. Further, we propose an effective technique for the joint learning of aspect term-based complaint/concern/praise identification and cause extraction (the primary tasks), with sentiment analysis as a secondary task that assists the primary tasks, and establish these models as baselines for further research in this direction. A sample of the dataset is available at: \url{https://anonymous.4open.science/r/MAspectX-327E/README.md}. The full dataset will be made publicly available for research upon acceptance of the paper.
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Content] Vision and Language
Relevance To Conference: This work advances multimedia/multimodal processing by introducing a novel approach to analyzing user feedback beyond mere sentiment polarity. By integrating textual and visual modalities, it addresses the limitations of existing uni-modal and multimodal studies in differentiating among complaints, concerns, and praises at a fine-grained aspect term level. Through the proposed classification and cause identification tasks, it enables a deeper understanding of user preferences, thereby facilitating more nuanced product and service improvements. Furthermore, the benchmark multimodal corpus lays the foundation for future research in this area.
Submission Number: 5730