Optimal Sequential Exploration: Bandits, Clairvoyants, and Wildcats
2013
Operations Research
The analysis relies heavily on results for bandit superprocesses, a generalization of the multiarmed bandit problem. ...
This paper was motivated by the problem of developing an optimal policy for exploring an oil and gas field in the North Sea. Where should we drill first? Where do we drill next? ...
Acknowledgments: The authors are grateful to Jo Eidsvik and Gabriele Martinelli for sparking interest in this problem, for sharing their model for the North Sea example, and for many helpful conversations ...
doi:10.1287/opre.2013.1164
fatcat:d2yhzb2vkfeeba5u7uhhbleuda
Machine Learning Techniques for Stackelberg Security Games: a Survey
[article]
2016
arXiv
pre-print
After a brief introduction on Stackelberg Security Games (SSGs) and the poaching setting, the rest of the work presents how to model a boundedly rational attacker taking into account her human behavior ...
... then describes how to face the problem of having attacker's payoffs not defined and how to estimate them and, finally, presents how online learning techniques have been exploited to learn a model of ...
Whittle proposed a heuristic index policy for RMABs by considering the Lagrangian relaxation of the problem [24]. ...
arXiv:1609.09341v1
fatcat:mglaovlwvvevxcbk7waal72aoe
Long range dependency and forecasting of housing price index and mortgage market rate: evidence of subprime crisis
2015
Management Science Letters
Table 3 presents the results of the Local Whittle and Local Polynomial Whittle of GARMA (p,d,q) estimators. ...
In this work, since we proposed a new non stationary process jointly with a reliable and robust wavelet-based estimation technique, it is necessary to give out a procedure of the forecast for this novel ...
doi:10.5267/j.msl.2015.3.012
fatcat:j4gqnmjbnzhfviwlotudcbaro4
Spectral Subsampling MCMC for Stationary Time Series
[article]
2020
arXiv
pre-print
We propose a novel technique for speeding up MCMC for time series data by efficient data subsampling in the frequency domain. ...
For several challenging time series models, we demonstrate a speedup of up to two orders of magnitude while incurring negligible bias compared to MCMC on the full dataset. ...
Quiroz et al. (2019a) propose speeding up MCMC for large $n$ by replacing $L_n(\theta)$ with an estimate $L(\theta, u)$ based on a small random subsample of $m \ll n$ observations, where $u = (u_1, \ldots, u_m)$ indexes the ...
arXiv:1910.13627v2
fatcat:k7npozn55jcmzg3grado7qm6ju
NEO: NEuro-Inspired Optimization—A Fractional Time Series Approach
2021
Frontiers in Physiology
We provide evidence of the efficacy of the proposed method on a wide variety of settings implicitly found in practice. ...
Solving optimization problems is a recurrent theme across different fields, including large-scale machine learning systems and deep learning. ...
Author Contributions: SC, SD, and SP performed the research. SC was responsible for the execution of the numerical experiments and wrote the manuscript with revisions by SD and SP. ...
doi:10.3389/fphys.2021.724044
pmid:34621183
pmcid:PMC8491743
fatcat:nsvpjljvhrezbkj6ngkqopxybe
Sequential Decision Making with Limited Observation Capability: Application to Wireless Networks
[article]
2019
arXiv
pre-print
The Whittle-index policy for solving LRB problem is analyzed; indexability of LRBs is shown. ...
Further, closed-form index expressions are provided for two sets of special cases; for more general cases, an algorithm for index computation is provided. ...
The index formula for each interval is given as follows. 1) For $\pi \in A_1$, the Whittle index is $W(\pi) = \rho(\pi)$. 2) For $\pi \in A_2$, we consider the following cases. a) If $\gamma_0(p_{1,0}) \geq \pi$, then the Whittle index is ...
arXiv:1801.01301v2
fatcat:f3dl4tl2gfh4bbmrp54eprvbru
Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments
2017
Marketing Science (Providence, R.I.)
Finally, we show that customer acquisition would decrease about 10% if the firm were to optimize click through rates instead of conversion directly, a finding that has implications for understanding the ...
Within a campaign, firms try to adapt to intermediate results of their tests, optimizing what they earn while learning about their ads. ...
After a model update at time $t$, we utilize the uncertainty around parameters $\beta_j$ to obtain the key distribution for our implementation of TS, the joint predictive distribution of ad conversion rates for ...
doi:10.1287/mksc.2016.1023
fatcat:oxkaeege6zeo3oopaxf7bvb5oy
Deep Reinforcement Learning for Simultaneous Sensing and Channel Access in Cognitive Networks
[article]
2021
arXiv
pre-print
To achieve this goal, we develop a novel algorithm that learns both access and sensing policies via deep Q-learning, dubbed Double Deep Q-network for Sensing and Access (DDQSA). ...
To the best of our knowledge, this is the first paper that solves both sensing and access policies for DSA via deep Q-learning. ...
In this work, we develop a novel algorithm for a single agent that learns both access and sensing policies via deep Q-learning, dubbed Double Deep Q-network for Sensing and Access (DDQSA). ...
arXiv:2110.14541v1
fatcat:3p5ntbzatrdexoyut73lorljxe
Scheduling in Time-correlated Wireless Networks with Imperfect CSI and Stringent Constraint
[article]
2014
arXiv
pre-print
In this work, we incorporate a stringent constraint on the simultaneously scheduled users and propose a low-complexity scheduling algorithm that dynamically implements user scheduling and dummy packet ...
In recent work, a low-complexity optimal solution was developed for this problem under a long-term time-average resource constraint. ...
The RMBP is Whittle indexable if every project is Whittle indexable. ...
arXiv:1403.7773v1
fatcat:3khj2fvirjbazfk2d6s5acd3f4
Bayesian Structure Learning for Stationary Time Series
[article]
2015
arXiv
pre-print
We leverage a Whittle likelihood approximation and define a conjugate prior---the hyper complex inverse Wishart---on the complex-valued and graph-constrained spectral matrices. ...
We take a Bayesian approach to structure learning, placing priors on (i) the graph structure and (ii) spectral matrices given the graph. ...
Motivated by the connection between GGMs and our TGMs, and the analogous structure of our TGM-based Whittle likelihood of Eq. (10) to that of a GGM with N i.i.d. observations, we propose a novel hyper ...
arXiv:1505.03131v2
fatcat:654al3dctrdohjqjzaqzaued4m
Nonparametric collective spectral density estimation with an application to clustering the brain signals
[article]
2017
arXiv
pre-print
In this paper, we develop a method for the simultaneous estimation of spectral density functions (SDFs) for a collection of stationary time series that share some common features. ...
A web-based shiny App found at "https://ncsde.shinyapps.io/NCSDE" is developed for visualization, training and learning the SDFs collectively using the proposed technique. ...
Conclusion: A novel approach for collectively estimating multiple SDFs was developed in this paper. ...
arXiv:1704.03907v3
fatcat:s67vg2o2ofgzhetqlfzecz4noq
Cooperative Multi-Agent Reinforcement Learning Based Distributed Dynamic Spectrum Access in Cognitive Radio Networks
[article]
2021
arXiv
pre-print
We employ the deep recurrent Q-network (DRQN) to address the partial observability of the state for each cognitive user. ...
The ultimate goal is to learn a cooperative strategy which maximizes the sum throughput of cognitive radio network in distributed fashion without coordination information exchange between cognitive users ...
The algorithm is model-based and in fact it is a single-agent Q-Learning framework implemented independently on SBSs. ...
arXiv:2106.09274v1
fatcat:cb5767uktrespkeoqmsvh2bxpq
Robust Restless Bandits: Tackling Interval Uncertainty with Deep Reinforcement Learning
[article]
2021
arXiv
pre-print
... novel deep reinforcement learning algorithm for solving RMABs. ...
To address this, we formulate the adversary oracle as a multi-agent reinforcement learning problem and solve it with a multi-agent extension of RMABPPO, which may be of independent interest as the first ...
Acknowledgments and Disclosure of Funding ...
arXiv:2107.01689v1
fatcat:zkenanwvcvdj3a2xiljx6s7gy4
Dynamic Assortment with Demand Learning for Seasonal Consumer Goods
2007
Management Science
Focusing on a stylized version of this problem, we study a finite horizon multiarmed bandit model with several plays per stage and Bayesian learning. ...
It yields a closed-form dynamic index policy capturing the key exploration versus exploitation trade-off and associated suboptimality bounds. ...
Finally, the second author is indebted to Martha Nieto for a conversation about Zara that sparked his interest in fast-fashion companies and was key to the genesis of this project. ...
doi:10.1287/mnsc.1060.0613
fatcat:s2zxaxjm6zgmzhhind4snnyyci
Dynamic spectrum access with deep Q-learning in densely occupied and partially observable environments
2021
Telfor Journal
We have developed a novel Deep Reinforcement Learning (DRL) based DSA method which combines a double deep Q-learning architecture with a recurrent neural network and takes advantage of a prioritized experience ...
Compared with other DRL methods for DSA, the proposed solution can find a near-optimal policy in a smaller number of iterations and suits a wider range of communication environments, including dynamic ...
However, when the channels are correlated, the Myopic and Whittle Index approaches cannot be applied. ...
doi:10.5937/telfor2101001t
fatcat:alp7qmzn65frfoc4w6hhgrtin4