Alex Beutel, Kenton Murray, Christos Faloutsos, Alexander J. Smola
2014 Proceedings of the 23rd international conference on World wide web - WWW '14  
Given a large dataset of users' ratings of movies, what is the best model to accurately predict which movies a person will like? And how can we prevent spammers from tricking our algorithms into suggesting a bad movie? Is it possible to infer structure between movies simultaneously? In this paper we describe a unified Bayesian approach to Collaborative Filtering that accomplishes all of these goals. It models the discrete structure of ratings and is flexible to the often non-Gaussian shape of the distribution. Additionally, our method finds a co-clustering of the users and items, which improves the model's accuracy and makes the model robust to fraud. We offer three main contributions: (1) We provide a novel model and Gibbs sampling algorithm that accurately models the quirks of real-world ratings, such as convex ratings distributions. (2) We provide proof of our model's robustness to spam and anomalous behavior. (3) We use several real-world datasets to demonstrate the model's effectiveness in accurately predicting users' ratings, avoiding prediction skew in the face of injected spam, and finding interesting patterns in real-world ratings data.
doi:10.1145/2566486.2568040 dblp:conf/www/BeutelMFS14 fatcat:jfc6bsoxzfahdd2usxjb2wuo2e