A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is application/pdf
.
On the Difficulty of Evaluating Baselines: A Study on Recommender Systems
[article]
2019
arXiv
pre-print
Numerical evaluations with comparisons to baselines play a central role when judging research in recommender systems. In this paper, we show that running baselines properly is difficult. We demonstrate this issue on two extensively studied datasets. First, we show that results for baselines that have been used in numerous publications over the past five years for the Movielens 10M benchmark are suboptimal. With a careful setup of a vanilla matrix factorization baseline, we are not only able to
arXiv:1905.01395v1
fatcat:4qka2twxuzgzfpu6wqae2ka6ay