Detection of review spam in online review sites like Yelp.com using machine learning algorithms
Summary: Opinions posted on online media are increasingly used by individuals and organizations to make purchase decisions and to guide marketing and product design. Positive opinions often mean profit and fame for businesses and individuals. This gives people a strong incentive to game the system and manipulate user sentiment by posting fake opinions or reviews that promote or discredit target products. Such individuals are called opinion spammers, and their activities are called opinion spamming.
Dataset: From Yelp.com
Goal:
Devise techniques to detect, with some degree of certainty, which reviews are spam, so that the online review website can take proper action.
Methodology: Used supervised as well as unsupervised learning algorithms and measured their classification performance. Feature selection was used to find well-performing features.
Algorithms used: SVM (linear kernel), Naive Bayes, LOF
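The algorithms listed above could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: it assumes scikit-learn, TF-IDF text features, chi-squared feature selection, and that LOF stands for Local Outlier Factor. None of these choices are stated in this README, and the reviews and labels are made-up toy data, not the Yelp dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import LocalOutlierFactor
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy reviews (NOT the Yelp data); label 1 = spam, 0 = genuine.
reviews = [
    "best place ever amazing amazing must visit now",
    "amazing deal best best buy now visit today",
    "worst place ever scam avoid avoid avoid",
    "the pasta was fine but service was slow",
    "nice patio, parking is hard to find on weekends",
    "we waited twenty minutes and the soup was cold",
]
labels = [1, 1, 1, 0, 0, 0]


def build_supervised(classifier, k=10):
    """TF-IDF features -> keep the k best by chi-squared score -> classifier."""
    return make_pipeline(TfidfVectorizer(), SelectKBest(chi2, k=k), classifier)


# Supervised models: linear-kernel SVM and Naive Bayes.
svm_model = build_supervised(LinearSVC()).fit(reviews, labels)
nb_model = build_supervised(MultinomialNB()).fit(reviews, labels)

# Predictions on the training reviews, only to show the API shape;
# a real evaluation would use held-out data.
svm_pred = svm_model.predict(reviews)
nb_pred = nb_model.predict(reviews)

# Unsupervised model: LOF ignores the labels and flags outliers.
# fit_predict returns -1 for outliers and 1 for inliers.
features = TfidfVectorizer().fit_transform(reviews).toarray()
lof_pred = LocalOutlierFactor(n_neighbors=2).fit_predict(features)
```

The value of k and the number of LOF neighbors are placeholders chosen to suit the toy data; on a real dataset they would need tuning.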
Output: A quantitative comparison of the classification techniques and feature selection techniques, showing which give the best results for the dataset.
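One common way to make such a comparison quantitative is to compute precision, recall, and F1 for the spam class. The sketch below assumes these metrics; this README does not say which ones the report uses. The classifier names and predictions are hypothetical placeholders, not project results.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the class marked `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical ground truth and predictions (1 = spam, 0 = genuine).
y_true = [1, 1, 1, 0, 0, 0]
predictions = {
    "classifier_a": [1, 1, 0, 0, 0, 0],
    "classifier_b": [1, 1, 1, 1, 0, 0],
}

scores = {name: precision_recall_f1(y_true, pred)
          for name, pred in predictions.items()}

for name, (precision, recall, f1) in scores.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

For an unsupervised detector such as LOF, the -1/1 outlier flags would first have to be mapped to spam/genuine labels before scoring them this way.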
This project contains a complete report and a short presentation of our findings.
Presentation Link: https://www.dropbox.com/s/4y9v5fn1md517t5/Data%20Mining%20Presentations.pptx?dl=0
Report Link: https://www.dropbox.com/s/7ohozv868yivo5f/Data%20Mining%20Project%20Report.pdf?dl=0