SEHP: stacking-based ensemble learning on novel features for review helpfulness prediction

Muhammad Shahid Iqbal Malik, Aftab Nawaz

Published: 2024, Last Modified: 02 Aug 2024. Knowl. Inf. Syst. 2024. CC BY-SA 4.0

Abstract: The helpfulness of reviews and its impact on purchase decisions are well established. This study presents a robust helpfulness prediction model for customer reviews. To this end, significant review textual features and newly defined reviewer characteristics are explored with a stacking-based ensemble model. More specifically, stylistic, time complexity, summary language, psychological, and linguistic features are introduced. To our knowledge, these features have not previously been explored with a stacking-based ensemble model for review helpfulness prediction. The proposed predictive model is evaluated on three benchmark Amazon review datasets consisting of 200,979 reviews in total. Two algorithms are presented to help readers understand the methodology and to help researchers reproduce the results. We compared several machine-learning, stacking-based ensemble, and 1-dimensional convolutional neural network (1D CNN) models. The stacking-based ensemble model achieves benchmark performance, obtaining a mean squared error of 0.009 with a hybrid combination of the proposed (reviewer and textual) features. Moreover, the proposed model outperformed five baselines, including the fine-tuned BERT (Bidirectional Encoder Representations from Transformers) model, reducing mean squared error by 40%. The results show that, as standalone models, review textual features are better predictors than reviewer features. The findings of this article have significant implications for researchers and business owners.
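
For readers unfamiliar with stacking, the general idea can be illustrated with a minimal sketch. This is not the authors' pipeline: the choice of base learners (a random forest and an SVR), the ridge meta-learner, the feature counts, and the synthetic data are all assumptions made for illustration, and the printed error has no relation to the 0.009 reported above. The sketch uses scikit-learn's `StackingRegressor`, which trains the meta-learner on cross-validated predictions of the base learners.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR


def make_synthetic_reviews(n=300, random_state=0):
    """Hypothetical stand-in data: 4 'textual' and 2 'reviewer' features."""
    rng = np.random.default_rng(random_state)
    textual = rng.normal(size=(n, 4))
    reviewer = rng.normal(size=(n, 2))
    # Hybrid feature matrix: textual and reviewer features side by side.
    X = np.hstack([textual, reviewer])
    signal = 0.6 * textual[:, 0] - 0.3 * textual[:, 1] + 0.2 * reviewer[:, 0]
    # Helpfulness score in [0, 1] with a little noise (assumed target form).
    y = 1.0 / (1.0 + np.exp(-signal)) + rng.normal(scale=0.05, size=n)
    return X, np.clip(y, 0.0, 1.0)


def build_stacking_model(random_state=0):
    """Two base regressors whose out-of-fold predictions feed a ridge meta-learner."""
    base_learners = [
        ("rf", RandomForestRegressor(n_estimators=50, random_state=random_state)),
        ("svr", SVR(kernel="rbf")),
    ]
    return StackingRegressor(estimators=base_learners, final_estimator=Ridge(), cv=5)


X, y = make_synthetic_reviews()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = build_stacking_model()
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Mean squared error, the metric named in the abstract.
mse = mean_squared_error(y_test, predictions)
print(f"test MSE on synthetic data: {mse:.4f}")
```

Swapping the columns passed to the model (textual only, reviewer only, or both) is one simple way to compare standalone and hybrid feature sets in the spirit of the comparison described in the abstract.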