Study of a bias in the offline evaluation of a recommendation algorithm
[article]
2015
arXiv
pre-print
It thus influences the way users interact with the system and, as a consequence, biases the evaluation of the performance of a recommendation algorithm computed using historical data (via offline evaluation ...
This paper describes this bias and discusses the relevance of a weighted offline evaluation to reduce this bias for different classes of recommendation algorithms. ...
Thus those campaigns have strongly biased the collected data, leading to a significant bias in the offline evaluation. ...
arXiv:1511.01280v1
fatcat:s4j3pggganfibk4ninktcssubu
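The weighted offline evaluation this abstract refers to can be sketched in a few lines: each held-out hit is scaled by an item weight so that items over-exposed by past recommendation campaigns count less. A minimal sketch in Python; the function name, the weight values, and the use of precision@k are illustrative assumptions, not the paper's exact estimator.

```python
def weighted_precision_at_k(recommended, held_out, item_weights, k=10):
    """Precision@k where each hit is scaled by an item weight.

    Down-weighting items that past campaigns over-exposed is one way to
    correct the bias the abstract describes (weights here are hypothetical).
    """
    top_k = recommended[:k]
    weighted_hits = sum(item_weights.get(item, 1.0)
                        for item in top_k if item in held_out)
    return weighted_hits / k

# Toy usage: item "a" was heavily promoted, so a hit on it counts less.
weights = {"a": 0.2, "b": 1.0, "c": 1.0}
print(weighted_precision_at_k(["a", "b", "x"], {"a", "b", "c"}, weights, k=3))
# 0.4  ->  (0.2 + 1.0) / 3
```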
Reducing offline evaluation bias of collaborative filtering algorithms
[article]
2015
arXiv
pre-print
It thus influences the way users interact with the system and, as a consequence, biases the evaluation of the performance of a recommendation algorithm computed using historical data (via offline evaluation ...
This paper presents a new application of a weighted offline evaluation to reduce this bias for collaborative filtering algorithms. ...
Thus those campaigns have strongly biased the collected data, leading to a significant bias in the offline evaluation score. ...
arXiv:1506.04135v1
fatcat:osqalsytu5gbrdpt5jbjhhs6ai
Comparing Offline and Online Recommender System Evaluations on Long-tail Distributions
2015
ACM Conference on Recommender Systems
In this investigation, we conduct a comparison between offline and online accuracy evaluation of different algorithms and settings in a real-world content recommender system. ...
By focusing on recommendations of long-tail items, which are usually more interesting for users, it was possible to reduce the bias caused by extremely popular items and to observe a better alignment of ...
ACKNOWLEDGEMENTS Our thanks to CI&T for supporting the development of Smart Canvas® recommender system evaluation framework and to the ITA for providing the research environment. ...
dblp:conf/recsys/MoreiraSC15
fatcat:etn7rpylt5ggndgxmqm2tzx5by
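A rough sketch of the long-tail focus described above: drop the most popular head of the catalog before computing metrics, so extremely popular items cannot dominate the evaluation. The `head_fraction` cutoff below is a hypothetical choice; the paper's exact threshold may differ.

```python
from collections import Counter

def long_tail_items(interactions, head_fraction=0.1):
    """Return the set of items outside the most-popular head.

    `interactions` is an iterable of (user, item) pairs; the head is the
    top `head_fraction` of items by interaction count (an illustrative
    cutoff, not necessarily the one used in the paper).
    """
    counts = Counter(item for _, item in interactions)
    ranked = [item for item, _ in counts.most_common()]
    head_size = max(1, int(len(ranked) * head_fraction))
    return set(ranked[head_size:])

interactions = [("u1", "a"), ("u2", "a"), ("u3", "a"), ("u1", "b"), ("u2", "c")]
print(long_tail_items(interactions))  # {'b', 'c'} with the default cutoff
```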
Reducing Offline Evaluation Bias in Recommendation Systems
[article]
2014
arXiv
pre-print
This adaptation process influences the way users interact with the system and, as a consequence, increases the difficulty of evaluating a recommendation algorithm with historical data (via offline evaluation ...
This paper analyses this evaluation bias and proposes a simple item weighting solution that reduces its impact. ...
A strong assumption we make is that in practice reducing offline evaluation bias for constant algorithms contributes to reducing offline evaluation bias for all algorithms. ...
arXiv:1407.0822v1
fatcat:vjrof7qe4jaufa5bml4rrfl5jq
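The constant-algorithm assumption above suggests one simple item-weighting scheme: choose weights so that every constant recommender (one that always recommends the same item) receives the same offline score. A minimal sketch under that reading; the inverse-frequency normalization used here is illustrative, not the paper's exact estimator.

```python
from collections import Counter

def equalizing_item_weights(test_items):
    """Inverse-frequency weights so every constant recommender scores equally.

    A constant algorithm that always recommends item i scores freq(i) under
    an unweighted evaluation; weighting each hit on i by a factor
    proportional to 1/freq(i) gives all constant algorithms the same
    weighted score. This is a simplified reading of the item-weighting
    idea, not the paper's exact method.
    """
    counts = Counter(test_items)
    n = len(test_items)
    return {item: n / (len(counts) * c) for item, c in counts.items()}

weights = equalizing_item_weights(["a", "a", "a", "b"])
print(weights)  # item "a" is down-weighted relative to "b"
```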
Do Offline Metrics Predict Online Performance in Recommender Systems?
[article]
2020
arXiv
pre-print
We study the impact of adding exploration strategies, and observe that their effectiveness, when compared to greedy recommendation, is highly dependent on the recommendation algorithm. ...
As a result, many state-of-the-art algorithms are designed to solve supervised learning problems, and progress is judged only by offline metrics. ...
Although this is a limitation of our work, there is significant value in studying algorithms and metrics in a simplified setting. ...
arXiv:2011.07931v1
fatcat:fre2cuepjzcv5gtnk3ulnblywu
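The exploration strategies compared against greedy recommendation can be illustrated with the simplest one, epsilon-greedy: with small probability, recommend a random candidate instead of the top-scored one. A minimal sketch; the paper studies several strategies, and this wrapper is only a generic example.

```python
import random

def epsilon_greedy(scores, candidates, epsilon=0.1, rng=random):
    """Recommend the top-scored item, but explore uniformly with prob. epsilon.

    `scores` maps item -> model score; this is a generic sketch of adding
    exploration on top of a greedy recommender, not the paper's setup.
    """
    if rng.random() < epsilon:
        return rng.choice(list(candidates))
    return max(candidates, key=lambda item: scores.get(item, float("-inf")))

scores = {"a": 0.9, "b": 0.4, "c": 0.1}
picks = [epsilon_greedy(scores, ["a", "b", "c"], epsilon=0.2) for _ in range(10)]
print(picks)  # mostly "a", with occasional random exploration
```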
Revisiting offline evaluation for implicit-feedback recommender systems
2019
Proceedings of the 13th ACM Conference on Recommender Systems - RecSys '19
Recommender systems are typically evaluated in an offline setting. ...
A subset of the available user-item interactions is sampled to serve as a test set, and a model trained on the remaining data points is then evaluated on its ability to predict which interactions ...
The biases present in these datasets pose a significant challenge when the data is used to evaluate other competing algorithms in an offline manner. ...
doi:10.1145/3298689.3347069
dblp:conf/recsys/Jeunen19
fatcat:tlm64i2mbza6hequt4xyrhl4zu
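The protocol this paper revisits is easy to sketch: sample a subset of logged interactions as a test set and train on the rest. Random sampling is shown below as one common choice (temporal splits are another); the function and parameters are illustrative.

```python
import random

def holdout_split(interactions, test_fraction=0.2, seed=0):
    """Sample a test set from (user, item) interactions; train on the rest.

    Mirrors the offline protocol the paper describes: a subset of
    interactions is held out and the model is judged on predicting them.
    """
    rng = random.Random(seed)
    shuffled = list(interactions)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

train, test = holdout_split([("u1", "a"), ("u1", "b"), ("u2", "c"), ("u2", "d")])
print(len(train), len(test))  # 3 1
```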
Estimating Error and Bias in Offline Evaluation Results
2020
Proceedings of the 2020 Conference on Human Information Interaction and Retrieval
We present a simulation study to estimate the error that such missing data causes in commonly-used evaluation metrics in order to assess its prevalence and impact. ...
Offline evaluations of recommender systems attempt to estimate users' satisfaction with recommendations using static data from prior user interactions. ...
Offline evaluation cannot accurately measure the effectiveness of truly novel recommendations: if a recommender algorithm reliably finds items the user has never heard of, but would enjoy, the evaluation ...
doi:10.1145/3343413.3378004
dblp:conf/chiir/TianE20
fatcat:dofm7765ircrzbk5tjyunrr2q4
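The simulation idea can be sketched as follows: construct ground-truth relevance, reveal only a random fraction of it (mimicking logged interactions), and compare a metric computed against the observed data with the same metric computed against the truth. All parameters here are illustrative, not the paper's simulation design.

```python
import random

def simulate_metric_error(n_items=1000, n_relevant=50, observe_prob=0.3, seed=0):
    """Toy simulation of how missing data distorts an offline metric.

    True relevant items are known by construction; only a random fraction
    is 'observed', as in logged interactions. Recall against the observed
    set is compared with recall against the truth.
    """
    rng = random.Random(seed)
    relevant = set(rng.sample(range(n_items), n_relevant))
    observed = {i for i in relevant if rng.random() < observe_prob}
    recommended = set(rng.sample(range(n_items), 100))  # a random recommender
    true_recall = len(recommended & relevant) / len(relevant)
    observed_recall = (len(recommended & observed) / len(observed)
                       if observed else 0.0)
    return true_recall, observed_recall

print(simulate_metric_error())  # the two estimates generally disagree
```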
Offline Evaluation and Optimization for Interactive Systems
2015
Proceedings of the Eighth ACM International Conference on Web Search and Data Mining - WSDM '15
... a news recommendation system • click lift of a new user feature in ad ranking • reduction of time for a user to find a relevant URL on a SERP • ...
[slide figure: users, articles, clicks, and movies logged as data for evaluating a policy]
Biases of Direct Method: • sampling/selection bias (data from production systems; Simpson's paradox) • modeling bias (insufficient features to fully represent ...
recommendation] • Choice #2: randomize around the current/production policy [Speller] • more exploration carries greater potential risk • can log the randomization seed and check offline to ...
doi:10.1145/2684822.2697040
dblp:conf/wsdm/Li15
fatcat:2ap6hcpimfh5xogmdzif6ar6ri
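As a contrast to the direct method whose biases the tutorial lists, the standard inverse propensity scoring (IPS) estimator reweights logged rewards by the ratio of target to logging propensities, which requires the randomized logging discussed above. A minimal sketch; the log-tuple layout is a hypothetical convention.

```python
def ips_estimate(logged, target_policy):
    """Inverse-propensity-scored estimate of a target policy's reward.

    `logged` is a list of (context, action, reward, logging_prob) tuples
    from a randomized production policy; `target_policy(context, action)`
    returns the target policy's probability of the logged action. The
    tuple layout is an assumed convention for this sketch.
    """
    total = 0.0
    for context, action, reward, logging_prob in logged:
        total += reward * target_policy(context, action) / logging_prob
    return total / len(logged)

# Toy usage: logs from a uniform policy over two actions.
logs = [("x", "a", 1.0, 0.5), ("x", "b", 0.0, 0.5)]
print(ips_estimate(logs, lambda ctx, act: 1.0 if act == "a" else 0.0))  # 1.0
```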
Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation
[article]
2020
arXiv
pre-print
To reduce bias in the learned model and policy, we use a discriminator to evaluate the quality of generated data and scale the generated rewards. ...
Our theoretical analysis and empirical evaluations demonstrate the effectiveness of our solution in learning policies from the offline and generated data. ...
And the policy's convergence in these algorithms is not well-studied. ...
arXiv:1911.03845v3
fatcat:qgonaucopfavnms4cud34j6gry
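The discriminator-based bias reduction described in the abstract can be sketched as scaling each simulated reward by a realism score, so implausible transitions from the learned user model contribute less to policy learning. The scaling rule below is illustrative; the paper's adversarial training is more involved.

```python
def scale_generated_rewards(generated, discriminator):
    """Scale simulated rewards by a discriminator's realism score.

    `generated` is a list of (state, action, reward) tuples from a learned
    user model; `discriminator(state, action)` returns a score in (0, 1)
    estimating how realistic the transition is. Down-weighting implausible
    transitions is the idea the abstract describes; the exact rule here is
    a hypothetical simplification.
    """
    return [(s, a, r * discriminator(s, a)) for s, a, r in generated]

fake = [("s0", "recommend_a", 1.0), ("s0", "recommend_b", 1.0)]
disc = lambda s, a: 0.9 if a == "recommend_a" else 0.2
print(scale_generated_rewards(fake, disc))
```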
A Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation
2019
Neural Information Processing Systems
To reduce bias in the learned model and policy, we use a discriminator to evaluate the quality of generated data and scale the generated rewards. ...
Our theoretical analysis and empirical evaluations demonstrate the effectiveness of our solution in learning policies from the offline and generated data. ...
And the policy's convergence in these algorithms is not well-studied. ...
dblp:conf/nips/BaiGW19
fatcat:m5lf3d2t7jcvflv4c5cgidl6su
Accelerated learning from recommender systems using multi-armed bandit
[article]
2019
arXiv
pre-print
Evaluating recommender system algorithms is a hard task, given all the inherent bias in the data, and successful companies must be able to rapidly iterate on their solution to maintain their competitive ...
The gold standard for evaluating recommendation algorithms has been the A/B test since it is an unbiased way to estimate how well one or more algorithms compare in the real world. ...
ACKNOWLEDGMENTS The authors would like to thank Travis Brady, Pavlos Mitsoulis Ntompos, Ben Dundee, Kurt Smith, and John Meakin for their internal review of this paper and their helpful feedback. ...
arXiv:1908.06158v1
fatcat:7rp3l5ea25feliymdm6cyeuska
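The multi-armed bandit alternative to a fixed A/B split can be sketched with Thompson sampling: sample each candidate algorithm's click rate from a Beta posterior and route the next request to the best draw, so traffic shifts toward winners faster. Arm names and counts below are illustrative.

```python
import random

def thompson_pick(successes, failures, rng=random):
    """Pick an arm (e.g., a candidate algorithm) by Thompson sampling.

    A Beta(successes+1, failures+1) posterior over each arm's click rate
    is sampled and the best draw wins, shifting traffic toward better
    arms faster than a fixed A/B split.
    """
    draws = {arm: rng.betavariate(successes[arm] + 1, failures[arm] + 1)
             for arm in successes}
    return max(draws, key=draws.get)

successes = {"algo_A": 30, "algo_B": 12}
failures = {"algo_A": 70, "algo_B": 88}
print(thompson_pick(successes, failures))  # usually "algo_A"
```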
Overview of NewsREEL'16: Multi-dimensional Evaluation of Real-Time Stream-Recommendation Algorithms
[chapter]
2016
Lecture Notes in Computer Science
The CLEF NewsREEL challenge is a campaign-style evaluation lab allowing participants to tackle news recommendation and to optimize and evaluate their recommender algorithms both online and offline. ...
In the intersection of these perspectives, new insights can be gained on how to effectively evaluate real-time stream recommendation algorithms. ...
The research leading to these results was performed in the CrowdRec project, which has received funding from the European Union Seventh Framework Program FP7/2007-2013 under grant agreement No. 610594 ...
doi:10.1007/978-3-319-44564-9_27
fatcat:dtmwy2ipj5di7dhywxmd45i5vq
Item Familiarity Effects in User-Centric Evaluations of Recommender Systems
2015
ACM Conference on Recommender Systems
In this paper we report the results of a user study in which participants recruited on a crowdsourcing platform assessed system-provided recommendations in a between-subjects experimental design. ...
The cognitive effort required by the participants for the evaluation of item recommendations in such settings depends on whether or not they already know the (features of the) recommended items. ...
INTRODUCTION Studies with users in a controlled environment are a powerful means to assess qualities of a recommender system that often cannot be evaluated in offline experimental designs. ...
dblp:conf/recsys/JannachLJ15a
fatcat:x72swgi4ejhf7eyur3dck6xw5e
A Comparison of Offline Evaluations, Online Evaluations, and User Studies in the Context of Research-Paper Recommender Systems
[chapter]
2015
Lecture Notes in Computer Science
In this paper, we examine and discuss the appropriateness of different evaluation methods, i.e. offline evaluations, online evaluations, and user studies, in the context of research-paper recommender systems ...
This is also true in the field of research-paper recommender systems, where the majority of recommendation approaches are evaluated offline, and only 34% of the approaches are evaluated with user studies ...
A recommender system might even recommend papers of higher relevance than those in the offline dataset, but the evaluation would give the algorithm a poor rating. ...
doi:10.1007/978-3-319-24592-8_12
fatcat:l6aklaw7bzb6piwd6dp3ya6fja
A comparative analysis of offline and online evaluations and discussion of research paper recommender system evaluation
2013
Proceedings of the International Workshop on Reproducibility and Replication in Recommender Systems Evaluation - RepSys '13
We conducted a study in which we evaluated various recommendation approaches with both offline and online evaluations. ...
Offline evaluations are the most common evaluation method for research paper recommender systems. ...
In contrast to user studies and online evaluations, offline evaluations measure the accuracy of a recommender system. ...
doi:10.1145/2532508.2532511
dblp:conf/recsys/BeelGLNG13
fatcat:2bxioctjfrhgnne223bgjip5le