Forecasting the nearly unforecastable: why aren't airline bookings adhering to the prediction algorithm?

Saravanan Thirumuruganathan, Soon-gyo Jung, Dianne Ramirez Robillos, Joni Salminen, Bernard J. Jansen
2021 Electronic Commerce Research  
A unique aspect of the model is the incorporation of self-competence, where the model defers when it cannot reasonably make a recommendation.  ...  We then compare the performance of the Next Likely Destination model in a real-life consumer study with 35,000 actual airline customers.  ...  Acknowledgements We thank the international airline company for its collaboration in this research.  ... 
doi:10.1007/s10660-021-09457-0 fatcat:5zin2rli5nfnnoetq4fwjg3klq

Collaborative Filtering Bandits

Shuai Li, Alexandros Karatzoglou, Claudio Gentile
2016 Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR '16  
In this work, we investigate an adaptive clustering technique for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings.  ...  We also provide a regret analysis within a standard linear stochastic noise setting.  ...  We would like to thank the anonymous reviewers for their helpful and constructive comments. The first author thanks the support from MIUR and QCRI-HBKU.  ... 
doi:10.1145/2911451.2911548 dblp:conf/sigir/LiKG16 fatcat:3m3aussumjco3i7qerbkxs5fxm

Data-driven software design with Constraint Oriented Multi-variate Bandit Optimization (COMBO)

Rasmus Ros, Mikael Hammar
2020 Empirical Software Engineering  
Method The toolkit was validated in a proof-of-concept by implementing two features that are relevant to Apptus, an e-commerce company that develops algorithms for web shops.  ...  Context Software design in e-commerce can be improved with user data through controlled experiments (i.e. A/B tests) to better meet user needs.  ...  Validation Company and e-Commerce The validation company Apptus is a small Swedish company which develops a platform for e-commerce.  ... 
doi:10.1007/s10664-020-09856-1 fatcat:ym7nvhzogjhnbm5qxxzonu7si4

Collaborative Filtering Bandits [article]

Shuai Li and Alexandros Karatzoglou and Claudio Gentile
2016 arXiv   pre-print
In this work, we investigate an adaptive clustering technique for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings.  ...  We also provide a regret analysis within a standard linear stochastic noise setting.  ...  We would like to thank the anonymous reviewers for their helpful and constructive comments. The first author thanks the support from MIUR and QCRI-HBKU.  ... 
arXiv:1502.03473v7 fatcat:wvbsmyhdyvbb5i7ulgqh7gpxzm

Reinforcement Learning in Practice: Opportunities and Challenges [article]

Yuxi Li
2022 arXiv   pre-print
Then we discuss opportunities of RL, in particular, products and services, games, bandits, recommender systems, robotics, transportation, finance and economics, healthcare, education, combinatorial optimization  ...  This article is a gentle discussion about the field of reinforcement learning in practice, about opportunities and challenges, touching a broad range of topics, with perspectives and without technical  ...  That is, recommendation is about decision making. Here we take the narrow view of recommendations as discussed above, in particular, in Web and e-commerce applications.  ... 
arXiv:2202.11296v2 fatcat:xdtsmme22rfpfn6rgfotcspnhy

Mitigating Bias in Algorithmic Systems - A Fish-Eye View

Kalia Orphanou, Jahna Otterbacher, Styliani Kleanthous, Khuyagbaatar Batsuren, Fausto Giunchiglia, Veronika Bogina, Avital Shulner-Tal, Alan Hartman, Tsvi Kuflik
2021 Zenodo  
Mitigating bias in algorithmic systems is a critical issue drawing attention across communities within the information and computer sciences.  ...  Given the complexity of the problem and the involvement of multiple stakeholders – including developers, end-users and third-parties – there is a need to understand the landscape of the sources of bias  ...  [143] detect the groups in the dataset that are unfairly treated by the classifier by developing an exploration-exploitation based strategy.  ... 
doi:10.5281/zenodo.6240582 fatcat:vftoi4woebhrrp5tlmkclabgf4

CHAMELEON: A Deep Learning Meta-Architecture for News Recommender Systems [Phd. Thesis] [article]

Gabriel de Souza Pereira Moreira
2019 arXiv   pre-print
A method is proposed for a realistic temporal offline evaluation of such task, replaying the stream of user clicks and fresh articles being continuously published in a news portal.  ...  problem, when compared to other traditional and state-of-the-art session-based recommendation algorithms.  ...  Their experimental results on an e-commerce dataset and on a movies dataset have shown that the CA-RNN outperformed other competitive sequential and context-aware models.  ... 
arXiv:2001.04831v1 fatcat:x2k3u26i4jebzjlesswnncfepq

Search Engines that Learn from Their Users

Anne Schuth
2016 SIGIR Forum  
For this reason it is vital to have a diverse set of experimental systems contributing to the pooling.  ...  The online and offline metrics together thus tell us how the system dealt with the exploration-exploitation trade-off [178].  ...  In OpenSearch, our living labs setting, we focus exclusively on head queries for a number of reasons: 1. This allows us to evaluate experimental search systems on the same set of queries. 2.  ... 
doi:10.1145/2964797.2964817 fatcat:lk24shg7dzbyzk7kkr4x6cjbna

Deep Reinforcement Learning, a textbook [article]

Aske Plaat
2022 arXiv   pre-print
The aim of this book is to provide a comprehensive overview of the field of deep reinforcement learning.  ...  We assume an undergraduate-level of understanding of computer science and artificial intelligence; the programming language of this book is Python.  ...  Bandit Theory The exploration/exploitation trade-off, the question of how to get the most reliable information at the least cost, has been studied extensively in the literature [346, 845].  ... 
arXiv:2201.02135v2 fatcat:3icsopexerfzxa3eblpu5oal64

Distributed reinforcement learning for adaptive and robust network intrusion response

Kleanthis Malialis, Sam Devlin, Daniel Kudenko
2015 Connection science  
The increasing adoption of technologies and the exponential growth of networks has made the area of information technology an integral part of our lives, where network security plays a vital role.  ...  Such an attack is designed to exhaust a server's resources or congest a network's infrastructure, and therefore  ...  Exploration-Exploitation Trade-off The exploration-exploitation trade-off constitutes a critical issue in the design of an RL agent.  ... 
doi:10.1080/09540091.2015.1031082 fatcat:vzwfb5cclzdqdhwgxzqozxe3xi

Counterfactual Evaluation And Learning From Logged User Feedback

Adith Swaminathan
We will view these applications through the lens of causal inference and modularize the problem of building a good ranking engine or recommender system into two components - first, infer a plausible assignment  ...  This study will yield new learning principles, algorithms and insights into the design of statistical estimators for counterfactual learning.  ...  For training directly from user interactions, the current state of the art explore-exploit algorithms demand interactive experimental control over the actions of a system to decide under uncertainty effectively  ... 
doi:10.7298/x4fj2dw6 fatcat:o4azwkhczjhhpbg34ceyrkuizy

Machine Learning for Marketing Decision Support

Johannes Sebastian Haupt, Humboldt-Universität Zu Berlin
These results form the basis for a machine-learning-based system for the detection and deletion of tracking elements from emails.  ...  An analysis of data collection practices in direct marketing emails reveals the ubiquity of tracking mechanisms without user consent in e-commerce communication.  ...  Experimental Design Data and Experimental Setting The experimental setting is based on a real-time targeting process in e-commerce.  ... 
doi:10.18452/21554 fatcat:7xcuy76f7rh2rewuxlf4kga4om

Choices, Risks, and Reward Reports: Charting Public Policy for Reinforcement Learning Systems

Thomas Krendl Gilbert, Sarah Dean, Tom Zick, Nathan Lambert
Building on the "model cards" and "datasheets" frameworks proposed by Mitchell et al. and Gebru et al., we argue the need for Reward Reports for AI systems.  ...  We argue that criteria for these choices may be drawn from emerging subfields within antitrust, tort, and administrative law.  ...  The authors also wish to thank the Center for Long-Term Cybersecurity and the Center for Human-Compatible AI for supporting previous stages of research resulting in this paper.  ... 
doi:10.48550/arxiv.2202.05716 fatcat:7lbfkuwkp5ajjfil6comv7oi3i

Conversational Information Seeking [article]

Hamed Zamani, Johanne R. Trippas, Jeff Dalton, Filip Radlinski
Conversational information seeking (CIS) is concerned with a sequence of interactions between one or more users and an information system.  ...  Interactions in CIS are primarily based on natural language dialogue, while they may include other types of interactions, such as click, touch, and body gestures.  ...  Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsors.  ... 
doi:10.48550/arxiv.2201.08808 fatcat:l2fzanmuv5ezhpwkhvgbrjv63m

Improved empirical methods in reinforcement-learning evaluation

Vukosi N. Marivate
We also develop a formal framework for characterizing the "capacity" of a space of parameterized RL algorithms and bound the generalization error of a set of algorithms on a distribution of RL environments  ...  Second, we develop a method for evaluating RL algorithms offline using a static collection of data.  ...  is known as the exploration-exploitation tradeoff (Sutton, 1988).  ... 
doi:10.7282/t34x59h0 fatcat:i54tn2hkafdpvnd22ecg653igi