
Study of a bias in the offline evaluation of a recommendation algorithm [article]

Arnaud De Myttenaere , Bénédicte Le Grand
2015 arXiv   pre-print
It thus influences the way users interact with the system and, as a consequence, biases the evaluation of the performance of a recommendation algorithm computed using historical data (via offline evaluation  ...  This paper describes this bias and discusses the relevance of a weighted offline evaluation to reduce it for different classes of recommendation algorithms.  ...  Thus those campaigns have strongly biased the collected data, leading to a significant bias in the offline evaluation.  ... 
arXiv:1511.01280v1 fatcat:s4j3pggganfibk4ninktcssubu
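
To make the weighted offline evaluation mentioned in this abstract concrete, here is a minimal sketch of how such a score might be computed; the data layout and the source of the weights are assumptions for illustration, not the paper's implementation. The same scheme applies to the collaborative-filtering follow-up below.

```python
import numpy as np

def weighted_offline_score(hits, test_items, weights):
    """Weighted offline evaluation: each test interaction counts in
    proportion to its item's weight rather than equally, e.g. to
    down-weight items over-exposed by past promotion campaigns.

    hits       -- boolean array, True if the recommender retrieved the item
    test_items -- item id per test interaction
    weights    -- dict mapping item id -> weight (defaults to 1.0)
    """
    w = np.array([weights.get(i, 1.0) for i in test_items])
    return float(np.sum(w * np.asarray(hits)) / np.sum(w))
```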

Reducing offline evaluation bias of collaborative filtering algorithms [article]

Arnaud De Myttenaere , Bénédicte Le Grand
2015 arXiv   pre-print
It thus influences the way users interact with the system and, as a consequence, biases the evaluation of the performance of a recommendation algorithm computed using historical data (via offline evaluation  ...  This paper presents a new application of a weighted offline evaluation to reduce this bias for collaborative filtering algorithms.  ...  Thus those campaigns have strongly biased the collected data, leading to a significant bias in the offline evaluation score.  ... 
arXiv:1506.04135v1 fatcat:osqalsytu5gbrdpt5jbjhhs6ai

Comparing Offline and Online Recommender System Evaluations on Long-tail Distributions

Gabriel de Souza Pereira Moreira, Gilmar Alves de Souza, Adilson Marques da Cunha
2015 ACM Conference on Recommender Systems  
In this investigation, we conduct a comparison between offline and online accuracy evaluation of different algorithms and settings in a real-world content recommender system.  ...  By focusing on recommendations of long-tail items, which are usually more interesting for users, it was possible to reduce the bias caused by extremely popular items and to observe a better alignment of  ...  ACKNOWLEDGEMENTS Our thanks to CI&T for supporting the development of Smart Canvas® recommender system evaluation framework and to the ITA for providing the research environment.  ... 
dblp:conf/recsys/MoreiraSC15 fatcat:etn7rpylt5ggndgxmqm2tzx5by
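
As a reader's aide, a short sketch of the long-tail restriction this abstract describes: rank items by popularity, split off the short head, and score recommendations on tail items only. The 10% head cutoff is an assumed parameter, not taken from the paper.

```python
from collections import Counter

def long_tail_split(interactions, head_fraction=0.1):
    """Rank items by interaction count and split them into a popular
    short head and a long tail, so accuracy metrics can be restricted
    to recommendations of tail items.

    interactions -- iterable of (user, item) pairs
    """
    counts = Counter(item for _, item in interactions)
    ranked = [item for item, _ in counts.most_common()]
    n_head = max(1, int(len(ranked) * head_fraction))
    return set(ranked[:n_head]), set(ranked[n_head:])
```

Evaluating hit rate only on test interactions whose item falls in the tail set reduces the influence of extremely popular items on the offline score.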

Reducing Offline Evaluation Bias in Recommendation Systems [article]

Arnaud De Myttenaere, Boris Golden
2014 arXiv   pre-print
This adaptation process influences the way users interact with the system and, as a consequence, increases the difficulty of evaluating a recommendation algorithm with historical data (via offline evaluation  ...  This paper analyses this evaluation bias and proposes a simple item weighting solution that reduces its impact.  ...  A strong assumption we make is that in practice reducing offline evaluation bias for constant algorithms contributes to reducing offline evaluation bias for all algorithms.  ... 
arXiv:1407.0822v1 fatcat:vjrof7qe4jaufa5bml4rrfl5jq
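
The constant-algorithm assumption quoted above suggests one simple way weights could be derived: choose per-item weights so that every constant recommender ("always suggest item i") receives the same offline score on the historical data as it would on a recent reference period. This is a hypothetical reading for illustration; the paper derives its weights by minimizing the bias directly.

```python
from collections import Counter

def item_weights(historical_items, recent_items):
    """w(i) = recent item frequency / historical item frequency.
    With these weights, the weighted offline score of the constant
    recommender 'always suggest i' on historical data matches its
    score on the recent reference period."""
    hist, rec = Counter(historical_items), Counter(recent_items)
    n_h, n_r = len(historical_items), len(recent_items)
    return {i: (rec[i] / n_r) / (hist[i] / n_h) for i in hist}
```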

Do Offline Metrics Predict Online Performance in Recommender Systems? [article]

Karl Krauth, Sarah Dean, Alex Zhao, Wenshuo Guo, Mihaela Curmei, Benjamin Recht, Michael I. Jordan
2020 arXiv   pre-print
We study the impact of adding exploration strategies, and observe that their effectiveness, when compared to greedy recommendation, is highly dependent on the recommendation algorithm.  ...  As a result, many state-of-the-art algorithms are designed to solve supervised learning problems, and progress is judged only by offline metrics.  ...  Although this is a limitation of our work, there is significant value in studying algorithms and metrics in a simplified setting.  ... 
arXiv:2011.07931v1 fatcat:fre2cuepjzcv5gtnk3ulnblywu
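
Since the snippet contrasts exploration strategies with greedy recommendation, a minimal epsilon-greedy slate builder shows the kind of wrapper being compared; this is a generic sketch, not the authors' experimental code.

```python
import numpy as np

def epsilon_greedy_slate(scores, k, epsilon=0.1, rng=None):
    """Fill k slots: each slot takes the best remaining item with
    probability 1 - epsilon (greedy), otherwise a uniformly random
    remaining item (exploration)."""
    rng = rng or np.random.default_rng()
    ranked = list(np.argsort(scores)[::-1])  # best-first item indices
    slate = []
    for _ in range(k):
        remaining = [i for i in ranked if i not in slate]
        if rng.random() < epsilon:
            slate.append(int(rng.choice(remaining)))
        else:
            slate.append(int(remaining[0]))
    return slate
```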

Revisiting offline evaluation for implicit-feedback recommender systems

Olivier Jeunen
2019 Proceedings of the 13th ACM Conference on Recommender Systems - RecSys '19  
Recommender systems are typically evaluated in an offline setting.  ...  A subset of the available user-item interactions is sampled to serve as a test set, and a model trained on the remaining data points is then evaluated on its ability to predict which interactions  ...  The biases present in these datasets pose a significant challenge when the data is used to evaluate other competing algorithms in an offline manner.  ... 
doi:10.1145/3298689.3347069 dblp:conf/recsys/Jeunen19 fatcat:tlm64i2mbza6hequt4xyrhl4zu
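
The sampled-test-set protocol described here is easy to pin down in code. Below is a leave-one-out sketch under assumed inputs (a user-to-item-set dict and any `recommend` callable, both invented for illustration); hit rate at k is one of several metrics such a harness could report.

```python
import random

def leave_one_out_hit_rate(user_items, recommend, k=10, seed=0):
    """Hold out one interaction per user, ask the model for a top-k
    list computed without it, and count how often the held-out item
    comes back.

    user_items -- {user: set of item ids}
    recommend  -- function (user, train_items, k) -> list of item ids
    """
    rng = random.Random(seed)
    hits = 0
    for user, items in user_items.items():
        held_out = rng.choice(sorted(items))
        hits += held_out in recommend(user, items - {held_out}, k)
    return hits / len(user_items)
```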

Estimating Error and Bias in Offline Evaluation Results

Mucun Tian, Michael D. Ekstrand
2020 Proceedings of the 2020 Conference on Human Information Interaction and Retrieval  
We present a simulation study to estimate the error that such missing data causes in commonly-used evaluation metrics in order to assess its prevalence and impact.  ...  Offline evaluations of recommender systems attempt to estimate users' satisfaction with recommendations using static data from prior user interactions.  ...  Offline evaluation cannot accurately measure the effectiveness of truly novel recommendations: if a recommender algorithm reliably finds items the user has never heard of, but would enjoy, the evaluation  ... 
doi:10.1145/3343413.3378004 dblp:conf/chiir/TianE20 fatcat:dofm7765ircrzbk5tjyunrr2q4
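
A toy version of such a simulation study, under assumed distributions (all parameters here are invented for illustration, not the paper's settings): generate complete preferences, observe a popularity-biased subset, and compare the same metric computed on each.

```python
import numpy as np

def simulate_metric_error(n_users=500, n_items=200, obs_rate=0.05, seed=0):
    """Return (true, observed) precision@10 of a popularity recommender,
    where 'observed' is computed on a popularity-biased sample of the
    complete preference matrix; the gap estimates the evaluation error
    introduced by missing data."""
    rng = np.random.default_rng(seed)
    true_pref = rng.random((n_users, n_items)) < 0.1        # complete relevance
    popularity = rng.dirichlet(np.ones(n_items) * 0.3)      # skewed exposure
    seen = rng.random((n_users, n_items)) < obs_rate * n_items * popularity
    observed = true_pref & seen
    top10 = np.argsort(popularity)[::-1][:10]               # popularity recs
    return true_pref[:, top10].mean(), observed[:, top10].mean()
```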

Offline Evaluation and Optimization for Interactive Systems

Lihong Li
2015 Proceedings of the Eighth ACM International Conference on Web Search and Data Mining - WSDM '15  
a news recommendation system • click lift of a new user feature in ad ranking • reduction of time for a user to find a relevant URL on a SERP • …  Biases of the direct method • sampling/selection bias (from production systems; Simpson's paradox) • modeling bias (insufficient features to fully represent  ...  Choice #2: randomize around the current/production policy [Speller] • more exploration causes greater potential risk • can log the randomization seed and check offline to  ... 
doi:10.1145/2684822.2697040 dblp:conf/wsdm/Li15 fatcat:2ap6hcpimfh5xogmdzif6ar6ri
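
The tutorial's theme, unbiased offline evaluation from logs collected under a randomized policy, is usually operationalized with an inverse-propensity-scored (IPS) estimator. A minimal sketch with an assumed log layout:

```python
import numpy as np

def ips_value(logged, target_policy):
    """Estimate a target policy's expected reward from logged data.

    logged        -- iterable of (context, action, reward, propensity),
                     where propensity is P(action | context) under the
                     logging policy
    target_policy -- function context -> action (deterministic, for brevity)
    """
    terms = [(r / p) if target_policy(ctx) == a else 0.0
             for ctx, a, r, p in logged]
    return float(np.mean(terms))
```

The estimator is unbiased only when every action the target policy can take has nonzero logging propensity, which is why the slides stress randomizing around the production policy and logging the randomization seed.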

Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation [article]

Xueying Bai, Jian Guan, Hongning Wang
2020 arXiv   pre-print
To reduce bias in the learned model and policy, we use a discriminator to evaluate the quality of generated data and scale the generated rewards.  ...  Our theoretical analysis and empirical evaluations demonstrate the effectiveness of our solution in learning policies from the offline and generated data.  ...  And the policy's convergence in these algorithms is not well-studied.  ... 
arXiv:1911.03845v3 fatcat:qgonaucopfavnms4cud34j6gry
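
One plausible reading of the reward-scaling step, written as a sketch (the function shape and clipping floor are assumptions, not the authors' code): rewards produced by the learned user model are multiplied by the discriminator's realism score before they reach the policy update.

```python
import numpy as np

def scale_generated_rewards(rewards, disc_scores, floor=1e-3):
    """Down-weight model-generated rewards by the discriminator's
    probability that each generated interaction is realistic, so the
    policy trusts implausible simulated data less."""
    d = np.clip(np.asarray(disc_scores, dtype=float), floor, 1.0)
    return np.asarray(rewards, dtype=float) * d
```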

A Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation

Xueying Bai, Jian Guan, Hongning Wang
2019 Neural Information Processing Systems  
To reduce bias in the learned model and policy, we use a discriminator to evaluate the quality of generated data and scale the generated rewards.  ...  Our theoretical analysis and empirical evaluations demonstrate the effectiveness of our solution in learning policies from the offline and generated data.  ...  And the policy's convergence in these algorithms is not well-studied.  ... 
dblp:conf/nips/BaiGW19 fatcat:m5lf3d2t7jcvflv4c5cgidl6su

Accelerated learning from recommender systems using multi-armed bandit [article]

Meisam Hejazinia, Kyler Eastman, Shuqin Ye, Abbas Amirabadi, Ravi Divvela
2019 arXiv   pre-print
Evaluating recommender system algorithms is a hard task, given all the inherent bias in the data, and successful companies must be able to rapidly iterate on their solution to maintain their competitive  ...  The gold standard for evaluating recommendation algorithms has been the A/B test since it is an unbiased way to estimate how well one or more algorithms compare in the real world.  ...  ACKNOWLEDGMENTS The authors would like to thank Travis Brady, Pavlos Mitsoulis Ntompos, Ben Dundee, Kurt Smith, and John Meakin for their internal review of this paper and their helpful feedback.  ... 
arXiv:1908.06158v1 fatcat:7rp3l5ea25feliymdm6cyeuska
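
A bandit layer over candidate recommenders, as the title suggests, can be sketched with Thompson sampling on per-algorithm click-through rates; the arm statistics and uniform priors here are assumptions for illustration.

```python
import random

def thompson_pick(arms):
    """Choose which recommendation algorithm serves the next request.

    arms -- {name: (clicks, impressions)}; each arm gets a
    Beta(1 + clicks, 1 + impressions - clicks) posterior over its CTR,
    and the arm with the highest sampled CTR wins."""
    def sample_ctr(stats):
        clicks, imps = stats
        return random.betavariate(1 + clicks, 1 + imps - clicks)
    return max(arms, key=lambda name: sample_ctr(arms[name]))

# e.g. thompson_pick({"mf": (12, 400), "item-knn": (9, 380)})
```

Compared to a fixed-split A/B test, traffic shifts toward better-performing algorithms as evidence accumulates, which is the acceleration the abstract refers to.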

Overview of NewsREEL'16: Multi-dimensional Evaluation of Real-Time Stream-Recommendation Algorithms [chapter]

Benjamin Kille, Andreas Lommatzsch, Gebrekirstos G. Gebremeskel, Frank Hopfgartner, Martha Larson, Jonas Seiler, Davide Malagoli, András Serény, Torben Brodt, Arjen P. de Vries
2016 Lecture Notes in Computer Science  
The CLEF NewsREEL challenge is a campaign-style evaluation lab allowing participants to tackle news recommendation and to optimize and evaluate their recommender algorithms both online and offline.  ...  In the intersection of these perspectives, new insights can be gained on how to effectively evaluate real-time stream recommendation algorithms.  ...  The research leading to these results was performed in the CrowdRec project, which has received funding from the European Union Seventh Framework Programme FP7/2007-2013 under grant agreement No. 610594  ... 
doi:10.1007/978-3-319-44564-9_27 fatcat:dtmwy2ipj5di7dhywxmd45i5vq

Item Familiarity Effects in User-Centric Evaluations of Recommender Systems

Dietmar Jannach, Lukas Lerche, Michael Jugovac
2015 ACM Conference on Recommender Systems  
In this paper we report the results of a user study in which participants recruited on a crowdsourcing platform assessed system-provided recommendations in a between-subjects experimental design.  ...  The cognitive effort required by the participants for the evaluation of item recommendations in such settings depends on whether or not they already know the (features of the) recommended items.  ...  Studies with users in a controlled environment are a powerful means to assess qualities of a recommendation system that often cannot be evaluated in offline experimental designs.  ... 
dblp:conf/recsys/JannachLJ15a fatcat:x72swgi4ejhf7eyur3dck6xw5e

A Comparison of Offline Evaluations, Online Evaluations, and User Studies in the Context of Research-Paper Recommender Systems [chapter]

Joeran Beel, Stefan Langer
2015 Lecture Notes in Computer Science  
In this paper, we examine and discuss the appropriateness of different evaluation methods, i.e. offline evaluations, online evaluations, and user studies, in the context of research-paper recommender systems  ...  This is also true in the field of research-paper recommender systems, where the majority of recommendation approaches are evaluated offline, and only 34% of the approaches are evaluated with user studies  ...  A recommender system might even recommend papers of higher relevance than those in the offline dataset, but the evaluation would give the algorithm a poor rating.  ... 
doi:10.1007/978-3-319-24592-8_12 fatcat:l6aklaw7bzb6piwd6dp3ya6fja

A comparative analysis of offline and online evaluations and discussion of research paper recommender system evaluation

Joeran Beel, Marcel Genzmehr, Stefan Langer, Andreas Nürnberger, Bela Gipp
2013 Proceedings of the International Workshop on Reproducibility and Replication in Recommender Systems Evaluation - RepSys '13  
We conducted a study in which we evaluated various recommendation approaches with both offline and online evaluations.  ...  Offline evaluations are the most common evaluation method for research paper recommender systems.  ...  In contrast to user studies and online evaluations, offline evaluations measure the accuracy of a recommender system.  ... 
doi:10.1145/2532508.2532511 dblp:conf/recsys/BeelGLNG13 fatcat:2bxioctjfrhgnne223bgjip5le