In this paper, we present a comparative study of text sentiment classification models that use term frequency-inverse document frequency (TF-IDF) vectorization, covering both supervised machine learning and lexicon-based techniques. Many promising machine learning and lexicon-based techniques have been proposed, but the relative effectiveness of each approach on specific types of problems is not well understood. To offer researchers comprehensive insights, we compare six algorithms against one another. The three machine learning algorithms are Logistic Regression (LR), Support Vector Machine (SVM), and Gradient Boosting. The three lexicon-based algorithms are Valence Aware Dictionary and Sentiment Reasoner (VADER), Pattern, and SentiWordNet. The underlying dataset consists of Amazon consumer reviews. As performance measures, we use accuracy, precision, recall, and F1-score. Our experimental results show that all three machine learning models outperform the lexicon-based models on all metrics. The SVM, Gradient Boosting, and LR models have accuracy of 89%, 87%, and 90%; precision of 90%, 88%, and 91%; recall of 98%, 98%, and 97%; and F1-score of 94%, 92%, and 94%, respectively. The Pattern, VADER, and SentiWordNet models have accuracy of 69%, 83%, and 80%; precision of 88%, 90%, and 90%; recall of 72%, 89%, and 88%; and F1-score of 79%, 89%, and 88%, respectively. Our machine learning results are slightly better than those of recent machine learning work on text sentiment, while our lexicon-based results are worse than those of recent comparable lexicon-based work.
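For readers who want to see how the four reported measures relate to one another, the following is a minimal sketch of computing accuracy, precision, recall, and F1-score for a binary sentiment task. The function name `binary_metrics` and the toy labels are hypothetical illustrations, not the code or data used in the study, and the sketch assumes the positive class is encoded as 1.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the label
    treated as the positive sentiment class (an assumption of this sketch).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example (made-up labels): 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because F1 is the harmonic mean of precision and recall, a model with very high recall but lower precision (as with the machine learning models above) can still reach a high F1-score.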

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License