Forecasting the nearly unforecastable: why aren't airline bookings adhering to the prediction algorithm?
2021
Electronic Commerce Research
A unique aspect of the model is the incorporation of self-competence, where the model defers when it cannot reasonably make a recommendation. ...
We then compare the performance of the Next Likely Destination model in a real-life consumer study with 35,000 actual airline customers. ...
Acknowledgements We thank the international airline company for its collaboration in this research. ...
doi:10.1007/s10660-021-09457-0
fatcat:5zin2rli5nfnnoetq4fwjg3klq
Collaborative Filtering Bandits
2016
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR '16
In this work, we investigate an adaptive clustering technique for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings. ...
We also provide a regret analysis within a standard linear stochastic noise setting. ...
We would like to thank the anonymous reviewers for their helpful and constructive comments. The first author thanks the support from MIUR and QCRI-HBKU. ...
doi:10.1145/2911451.2911548
dblp:conf/sigir/LiKG16
fatcat:3m3aussumjco3i7qerbkxs5fxm
Data-driven software design with Constraint Oriented Multi-variate Bandit Optimization (COMBO)
2020
Empirical Software Engineering
Method The toolkit was validated in a proof-of-concept by implementing two features that are relevant to Apptus, an e-commerce company that develops algorithms for web shops. ...
Context Software design in e-commerce can be improved with user data through controlled experiments (i.e. A/B tests) to better meet user needs. ...
Validation Company and e-Commerce The validation company Apptus is a small Swedish company which develops a platform for e-commerce. ...
doi:10.1007/s10664-020-09856-1
fatcat:ym7nvhzogjhnbm5qxxzonu7si4
Collaborative Filtering Bandits
[article]
2016
arXiv
pre-print
In this work, we investigate an adaptive clustering technique for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings. ...
We also provide a regret analysis within a standard linear stochastic noise setting. ...
We would like to thank the anonymous reviewers for their helpful and constructive comments. The first author thanks the support from MIUR and QCRI-HBKU. ...
arXiv:1502.03473v7
fatcat:wvbsmyhdyvbb5i7ulgqh7gpxzm
Reinforcement Learning in Practice: Opportunities and Challenges
[article]
2022
arXiv
pre-print
Then we discuss opportunities of RL, in particular, products and services, games, bandits, recommender systems, robotics, transportation, finance and economics, healthcare, education, combinatorial optimization ...
This article is a gentle discussion about the field of reinforcement learning in practice, about opportunities and challenges, touching a broad range of topics, with perspectives and without technical ...
That is, recommendation is about decision making. Here we take the narrow view of recommendations as discussed above, in particular, in Web and e-commerce applications. ...
arXiv:2202.11296v2
fatcat:xdtsmme22rfpfn6rgfotcspnhy
Mitigating Bias in Algorithmic Systems - A Fish-Eye View
2021
Zenodo
Mitigating bias in algorithmic systems is a critical issue drawing attention across communities within the information and computer sciences. ...
Given the complexity of the problem and the involvement of multiple stakeholders – including developers, end-users and third-parties – there is a need to understand the landscape of the sources of bias ...
[143] detect the groups in the dataset that are unfairly treated by the classifier by developing an exploration-exploitation based strategy. ...
doi:10.5281/zenodo.6240582
fatcat:vftoi4woebhrrp5tlmkclabgf4
CHAMELEON: A Deep Learning Meta-Architecture for News Recommender Systems [Phd. Thesis]
[article]
2019
arXiv
pre-print
A method is proposed for a realistic temporal offline evaluation of such task, replaying the stream of user clicks and fresh articles being continuously published in a news portal. ...
problem, when compared to other traditional and state-of-the-art session-based recommendation algorithms. ...
Their experimental results on an e-commerce dataset and on a movies dataset have shown that the CA-RNN outperformed other competitive sequential and context-aware models. ...
arXiv:2001.04831v1
fatcat:x2k3u26i4jebzjlesswnncfepq
Search Engines that Learn from Their Users
2016
SIGIR Forum
For this reason it is vital to have a diverse set of experimental systems contributing to the pooling. ...
The online and offline metrics together thus tell us how the system dealt with the exploration-exploitation trade-off [178]. ...
In OpenSearch, our living labs setting, we focus exclusively on head queries for a number of reasons: 1. This allows us to evaluate experimental search systems on the same set of queries. 2. ...
doi:10.1145/2964797.2964817
fatcat:lk24shg7dzbyzk7kkr4x6cjbna
Deep Reinforcement Learning, a textbook
[article]
2022
arXiv
pre-print
The aim of this book is to provide a comprehensive overview of the field of deep reinforcement learning. ...
We assume an undergraduate-level of understanding of computer science and artificial intelligence; the programming language of this book is Python. ...
Bandit Theory The exploration/exploitation trade-off, the question of how to get the most reliable information at the least cost, has been studied extensively in the literature [346, 845]. ...
arXiv:2201.02135v2
fatcat:3icsopexerfzxa3eblpu5oal64
Distributed reinforcement learning for adaptive and robust network intrusion response
2015
Connection science
The increasing adoption of technologies and the exponential growth of networks has made the area of information technology an integral part of our lives, where network security plays a vital role. ...
Such an attack is designed to exhaust a server's resources or congest a network's infrastructure, and therefore ...
Exploration-Exploitation Trade-off The exploration-exploitation trade-off constitutes a critical issue in the design of an RL agent. ...
doi:10.1080/09540091.2015.1031082
fatcat:vzwfb5cclzdqdhwgxzqozxe3xi
Counterfactual Evaluation And Learning From Logged User Feedback
2017
We will view these applications through the lens of causal inference and modularize the problem of building a good ranking engine or recommender system into two components: first, infer a plausible assignment ...
This study will yield new learning principles, algorithms and insights into the design of statistical estimators for counterfactual learning. ...
For training directly from user interactions, the current state of the art explore-exploit algorithms demand interactive experimental control over the actions of a system to decide under uncertainty effectively ...
doi:10.7298/x4fj2dw6
fatcat:o4azwkhczjhhpbg34ceyrkuizy
Machine Learning for Marketing Decision Support
2020
These results form the basis for a machine-learning-based system for the detection and deletion of tracking elements from emails. ...
An analysis of data collection practices in direct marketing emails reveals the ubiquity of tracking mechanisms without user consent in e-commerce communication. ...
Experimental Design
Data and Experimental Setting The experimental setting is based on a real-time targeting process in e-commerce. ...
doi:10.18452/21554
fatcat:7xcuy76f7rh2rewuxlf4kga4om
Choices, Risks, and Reward Reports: Charting Public Policy for Reinforcement Learning Systems
2022
Building on the "model cards" and "datasheets" frameworks proposed by Mitchell et al. and Gebru et al., we argue the need for Reward Reports for AI systems. ...
We argue that criteria for these choices may be drawn from emerging subfields within antitrust, tort, and administrative law. ...
The authors also wish to thank the Center for Long-Term Cybersecurity and the Center for Human-Compatible AI for supporting previous stages of research resulting in this paper. ...
doi:10.48550/arxiv.2202.05716
fatcat:7lbfkuwkp5ajjfil6comv7oi3i
Conversational Information Seeking
[article]
2022
Conversational information seeking (CIS) is concerned with a sequence of interactions between one or more users and an information system. ...
Interactions in CIS are primarily based on natural language dialogue, while they may include other types of interactions, such as click, touch, and body gestures. ...
Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsors. ...
doi:10.48550/arxiv.2201.08808
fatcat:l2fzanmuv5ezhpwkhvgbrjv63m
Improved empirical methods in reinforcement-learning evaluation
2015
We also develop a formal framework for characterizing the "capacity" of a space of parameterized RL algorithms and bound the generalization error of a set of algorithms on a distribution of RL environments ...
Second, we develop a method for evaluating RL algorithms offline using a static collection of data. ...
is known as the exploration-exploitation tradeoff (Sutton, 1988). ...
doi:10.7282/t34x59h0
fatcat:i54tn2hkafdpvnd22ecg653igi