Spam Review Detection using the Linguistic and Spammer Behavioral Methods

Naveed Hussain, Hamid Turab Mirza, Ibrar Hussain, Faiza Iqbal, Imran Memon
2020 IEEE Access  
Online reviews regarding different products or services have become the main source to determine public opinions. Consequently, manufacturers and sellers are extremely concerned with customer reviews as these have a direct impact on their businesses. Unfortunately, to gain profits or fame, spam reviews are written to promote or demote targeted products or services. This practice is known as review spamming. In recent years, the spam review detection problem has gained much attention from
more » ... ties and researchers, but still there is a need to perform experiments on real-world large-scale review datasets. This can help to analyze the impact of widespread opinion spam in online reviews. In this work, two different spam review detection methods have been proposed: (1) Spam Review Detection using Behavioral Method (SRD-BM) utilizes thirteen different spammer's behavioral features to calculate the review spam score which is then used to identify spammers and spam reviews, and (2) Spam Review Detection using Linguistic Method (SRD-LM) works on the content of the reviews and utilizes transformation, feature selection and classification to identify the spam reviews. Experimental evaluations are conducted on a real-world Amazon review dataset which analyze 26.7 million reviews and 15.4 million reviewers. The evaluations show that both proposed models have significantly improved the detection process of spam reviews. Specifically, SRD-BM achieved 93.1% accuracy whereas SRD-LM achieved 88.5% accuracy in spam review detection. Comparatively, SRD-BM achieved better accuracy because it works on utilizing rich set of spammers behavioral features of review dataset which provides in-depth analysis of spammer behaviour. Moreover, both proposed models outperformed existing approaches when compared in terms of accurate identification of spam reviews. To the best of our knowledge, this is the first study of its kind which uses large-scale review dataset to analyze different spammers' behavioral features and linguistic method utilizing different available classifiers. INDEX TERMS Online product reviews, spam reviews, spam review detection, linguistic features, spammer behavioral features.
doi:10.1109/access.2020.2979226 fatcat:4o3zwel3qfgubblejwcjndw7sy