SMU Data Science Review

Yelp’s Review Filtering Algorithm

Yao Yao, Southern Methodist University
Ivelin Angelov, Southern Methodist University
Jack Rasmus-Vorrath, Southern Methodist University
Mooyoung Lee, Southern Methodist University
Daniel W. Engels, Southern Methodist University

Abstract

In this paper, we present an analysis of features influencing Yelp’s proprietary review filtering algorithm. Classifying or misclassifying reviews as recommended or non-recommended affects average ratings, consumer decisions, and ultimately, business revenue. Our analysis involves systematically sampling and scraping Yelp restaurant reviews. Features are extracted from review metadata and engineered from metrics and scores generated using text classifiers and sentiment analysis. The coefficients of a multivariate logistic regression model are interpreted as quantifications of the relative importance of features in classifying reviews as recommended or non-recommended. The model classifies review recommendations with an accuracy of 78%. We find that reviews are most likely to be recommended when they convey an overall positive message written in a few moderately complex sentences expressing substantive detail with an informative range of varied sentiment. Other factors relating to patterns and frequency of platform use also bear strongly on review recommendations. Though not without important ethical implications, the findings are logically consistent with Yelp’s efforts to facilitate, inform, and empower consumer decisions.
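
As a rough illustration of the approach summarized above, the following minimal sketch fits a logistic regression on standardized features and ranks the features by the magnitude of their coefficients. It is not the authors' code or data: the feature names (sentiment_score, word_count, noise), the synthetic reviews, and the generating weights are all assumptions made up for the example, and plain gradient descent stands in for whatever fitting procedure the paper used.

```python
import math
import random

# Hypothetical feature names, chosen only for illustration.
FEATURES = ["sentiment_score", "word_count", "noise"]
# Assumed weights used to generate synthetic labels (not values from the paper).
TRUE_W = [2.0, -1.0, 0.0]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(rows):
    """Scale each column to zero mean and unit variance so that
    coefficient magnitudes are comparable across features."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def predict_proba(x, w, b):
    """Probability that a review with features x is labeled 1 (recommended)."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic stand-in for scraped reviews: 400 rows of random features.
rng = random.Random(0)
X_raw = [[rng.gauss(0, 1) for _ in FEATURES] for _ in range(400)]
y = [1 if rng.random() < sigmoid(sum(w * x for w, x in zip(TRUE_W, row))) else 0
     for row in X_raw]

X = standardize(X_raw)
weights, bias = fit_logistic(X, y)

# Rank features by absolute coefficient; the sign gives the direction of effect.
ranked = sorted(zip(FEATURES, weights), key=lambda p: abs(p[1]), reverse=True)
acc = sum((predict_proba(xi, weights, bias) >= 0.5) == (yi == 1)
          for xi, yi in zip(X, y)) / len(y)

for name, coef in ranked:
    print(f"{name:16s} {coef:+.3f}")
print(f"training accuracy: {acc:.2f}")
```

Standardizing first matters for this reading of the coefficients: without a common scale, a feature measured in large units would receive a small coefficient regardless of its influence. In practice a library implementation would likely replace the hand-written fitting loop.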

Recommended Citation

Yao, Yao; Angelov, Ivelin; Rasmus-Vorrath, Jack; Lee, Mooyoung; and Engels, Daniel W. (2018) "Yelp’s Review Filtering Algorithm," SMU Data Science Review: Vol. 1: No. 3, Article 3.
Available at: https://scholar.smu.edu/datasciencereview/vol1/iss3/3