SMU Data Science Review

Comparative Study of Sentiment Analysis with Product Reviews Using Machine Learning and Lexicon-Based Approaches

Heidi Nguyen, Southern Methodist University
Aravind Veluchamy, Southern Methodist University
Mamadou Diop, Southern Methodist University
Rashed Iqbal, Ras Al Khaimah Academy

Abstract

In this paper, we present a comparative study of text sentiment classification models, using term frequency-inverse document frequency (TF-IDF) vectorization, across both supervised machine learning and lexicon-based techniques. Multiple promising machine learning and lexicon-based techniques exist, but the relative performance of each approach on specific types of problems is not well understood. To offer researchers comprehensive insights, we compare a total of six algorithms. The three machine learning algorithms are Logistic Regression (LR), Support Vector Machine (SVM), and Gradient Boosting. The three lexicon-based algorithms are Valence Aware Dictionary and Sentiment Reasoner (VADER), Pattern, and SentiWordNet. The underlying dataset consists of Amazon consumer reviews. As performance measures, we use accuracy, precision, recall, and F1-score. Our experimental results show that all three machine learning models outperform the lexicon-based models on all metrics. The SVM, Gradient Boosting, and LR models have accuracy of 89%, 87%, and 90%; precision of 90%, 88%, and 91%; recall of 98%, 98%, and 97%; and F1-score of 94%, 92%, and 94%, respectively. The Pattern, VADER, and SentiWordNet models have accuracy of 69%, 83%, and 80%; precision of 88%, 90%, and 90%; recall of 72%, 89%, and 88%; and F1-score of 79%, 89%, and 88%, respectively. Our machine learning results are slightly better than those of recent machine learning work on text sentiment, while our lexicon-based results are worse than those of recent comparable lexicon-based work.
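
As a reading aid for the four measures reported above, the following is a minimal Python sketch of how accuracy, precision, recall, and F1-score are conventionally computed for binary sentiment labels. It is not the authors' evaluation code. The function name `classification_metrics`, the encoding of positive sentiment as `1`, and the toy labels are all assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption: `positive` marks the positive-sentiment class; the
    abstract does not state which class the paper treats as positive.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (hypothetical, not taken from the paper's dataset).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because F1 is a harmonic mean, it always lies between precision and recall and sits closer to the lower of the two.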

Recommended Citation

Nguyen, Heidi; Veluchamy, Aravind; Diop, Mamadou; and Iqbal, Rashed (2018) "Comparative Study of Sentiment Analysis with Product Reviews Using Machine Learning and Lexicon-Based Approaches," SMU Data Science Review: Vol. 1: No. 4, Article 7.
Available at: https://scholar.smu.edu/datasciencereview/vol1/iss4/7