Sentiment Analysis on Tweets Using Machine Learning and Combinatorial Fusion

James Ho, Dominik Ondusko, Brandon Roy, D. Frank Hsu

Published: 2019, Last Modified: 24 Feb 2025 | DASC/PiCom/DataCom/CyberSciTech 2019 | CC BY-SA 4.0

Abstract: Sentiment analysis using social network platforms such as Twitter has achieved tremendous results. However, because of imbalanced data content and semantic context, it remains a challenge to produce complete and effective sentiment labeling. In this paper, we propose a two-stage data analytic approach consisting of machine learning algorithms and combinatorial fusion. The first stage uses five machine learning algorithms: logistic regression, naive Bayes, perceptron, random forest, and support vector machine (SVM). Combinatorial fusion is then used to combine subsets of these five algorithms. We conduct our investigation using a Kaggle dataset to classify each tweet as having positive, neutral, or negative sentiment. We demonstrate that although most of the machine learning algorithms perform well, combinations of these algorithms with higher performance ratio and cognitive diversity can perform even better.
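The second stage can be illustrated with a minimal sketch of score and rank combination, the two fusion modes commonly used in combinatorial fusion. The abstract does not give the paper's formulas, so everything here is an assumption: the min-max normalization, the plain averaging, the definition of cognitive diversity as the root-mean-square gap between two rank-score characteristic functions, all function names, and the toy scores. A three-class setup such as positive/neutral/negative would presumably apply this once per class.

```python
from itertools import combinations
from math import sqrt


def normalize(scores):
    """Min-max scale one classifier's scores to [0, 1] (assumed normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ranks(scores):
    """Rank items by descending score; rank 1 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, item in enumerate(order, start=1):
        result[item] = position
    return result


def rank_score_function(scores):
    """Rank-score characteristic function: normalized score at each rank."""
    return sorted(normalize(scores), reverse=True)


def cognitive_diversity(scores_a, scores_b):
    """Assumed definition: RMS difference of the two rank-score functions."""
    fa = rank_score_function(scores_a)
    fb = rank_score_function(scores_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa))


def score_combination(systems):
    """Average the normalized scores of several classifiers, item by item."""
    normed = [normalize(s) for s in systems]
    return [sum(column) / len(column) for column in zip(*normed)]


def rank_combination(systems):
    """Average the ranks of several classifiers (lower means more confident)."""
    ranked = [ranks(s) for s in systems]
    return [sum(column) / len(column) for column in zip(*ranked)]


# Hypothetical "positive"-class scores for four tweets from three classifiers.
toy_scores = {
    "logistic_regression": [0.9, 0.2, 0.6, 0.4],
    "naive_bayes": [0.8, 0.1, 0.3, 0.7],
    "svm": [0.7, 0.3, 0.9, 0.2],
}

for name_a, name_b in combinations(toy_scores, 2):
    diversity = cognitive_diversity(toy_scores[name_a], toy_scores[name_b])
    fused_scores = score_combination([toy_scores[name_a], toy_scores[name_b]])
    fused_ranks = rank_combination([toy_scores[name_a], toy_scores[name_b]])
    print(name_a, name_b, round(diversity, 3), fused_scores, fused_ranks)
```

In this sketch, each pair of classifiers yields one diversity value and two fused outputs, so a pair could be selected for fusion by weighing its diversity against how well each member performs on its own. How the paper actually defines and applies the performance ratio is not stated in the abstract.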