Recognizing themes in Amazon reviews through Unsupervised Multi-Document Summarization

13 Jan 2022 | OpenReview Archive Direct Upload | Readers: Everyone
Abstract: Amazon now has an overwhelming number of reviews for popular products, and shoppers routinely spend a lot of time reading through hundreds or thousands of reviews while making mental notes of important themes. We propose a system for unsupervised extractive summarization that combines deep learning feature extractors with several different models: k-means, affinity propagation, DBSCAN, and PageRank. We also propose a system for unsupervised abstractive summarization using a deep learning model. These summarization models capture the major themes across all the reviews of a product and optionally report the number of users who mentioned each theme, giving shoppers a sense of salience. Since summarization in an unsupervised setting is notoriously difficult to evaluate, we propose a suite of metrics to measure the quality of a summary: ROUGE-1, semantic similarity, sentiment accuracy, attribute match, and content preservation. We found that extractive summarization is able to capture important themes, though the exemplar sentence is sometimes not well chosen. We found that abstractive summarization is able to generate human-readable text, though content preservation is challenging. Our k-means clustering baseline scored 3.48 and 3.44 for attribute match and content preservation, respectively, and we achieved 3.72 and 3.93, respectively, with PageRank.
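Of the metrics listed above, ROUGE-1 is the simplest to state: it measures unigram overlap between a candidate summary and a reference text. The following is a minimal sketch of the standard definition, not the authors' evaluation code. The function name `rouge_1`, the lowercase whitespace tokenization, and the example sentences are assumptions made for illustration only.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 (standard ROUGE-1 definition).

    Assumption: tokens are lowercase whitespace-separated words; a real
    evaluation would likely use a proper tokenizer and possibly stemming.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example sentences, not taken from the paper's data.
scores = rouge_1("battery life is great", "the battery life is great")
print(scores)
```

In this example the candidate shares all four of its words with the five-word reference, so precision is 1.0 and recall is 0.8. Because ROUGE-1 only counts surface word overlap, it cannot reward paraphrases, which is one reason a suite that also includes semantic similarity and content preservation is useful.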