
Page 1322 of Mathematical Reviews Vol. , Issue 98B [page]

1998 Mathematical Reviews  
A Markov decision process is considered where the infinite horizon average expected reward is to be maximized subject to a number of inequality constraints on average expected costs.  ...  Summary: “This paper studies the expected average cost control problem for discrete-time Markov decision processes with denumerably infinite state spaces.  ... 

Page 2324 of Mathematical Reviews Vol. , Issue 92d [page]

1992 Mathematical Reviews  
Algorithmic optimal policies are the infinite horizon limit of policies which are optimal for the finite horizon problems.  ...  Gittins (4-OX-EC) 92d:90111 90C40 93D21 Cavazos-Cadena, Rolando (MEX-UAAG-SC); Hernandez-Lerma, Onésimo (MEX-IPN-CI) Recursive adaptive control of Markov decision processes with the average reward criterion  ... 

Reinforcement Learning With Function Approximation for Traffic Signal Control

Prashanth LA, Shalabh Bhatnagar
2011 IEEE transactions on intelligent transportation systems (Print)  
Details: We solve an infinite horizon discounted reward Markov decision process (MDP).  ...  Adapting the constraints of the RALP to include the active constraints of the ALP-critic by means of a smoothed function gradient scheme.  ...  Stochastic gradient algorithms for routing in queueing networks  ...  Actor-critic algorithms based on ALP Details: We solve an infinite horizon discounted reward Markov decision process (MDP).  ... 
doi:10.1109/tits.2010.2091408 fatcat:vf7vpxouxjf65cxqouso4rktlq

A Model for Multi-timescaled Sequential Decision-making Processes with Adversary

H.S. Chang
2004 Mathematical and Computer Modelling of Dynamical Systems  
Games (MMGs) for hierarchically structured sequential decision making processes in two players' competitive situations where one player (the minimizer) wishes to minimize his cost that will be paid to  ...  Extending the multi-time scale model proposed by the author et al. in the context of Markov decision processes, this paper proposes a simple analytical model called M time-scale two-person zero-sum Markov  ...  Even though we considered only the infinite horizon discounted cost criterion as a performance measure for simplicity, extending the discussion here into the infinite horizon average cost is straightforward  ... 
doi:10.1080/13873950412331335261 fatcat:nvqncyhznfcmtjacfzj75iache

Markov Decision Processes

Nicole Bäuerle, Ulrich Rieder
2010 Jahresbericht der Deutschen Mathematiker-Vereinigung (Teubner)  
We treat Markov Decision Processes with finite and infinite time horizon where we will restrict the presentation to the so-called (generalized) negative case.  ...  The theory of Markov Decision Processes is the theory of controlled Markov chains. Its origins can be traced back to R. Bellman and L. Shapley in the 1950's.  ...  He established the theory of Markov Decision Processes in Germany 40 years ago.  ... 
doi:10.1365/s13291-010-0007-2 fatcat:no6ig6xj7rcxtiltf7b5itfl2u

Page 3361 of Mathematical Reviews Vol. , Issue 87f [page]

1987 Mathematical Reviews  
Existence of optimal dynamic routing policies is proved for the long-run average and infinite-horizon discounted cases.  ...  In attempting to maximize average return per unit time over an infinite horizon each customer must make a single decision, a choice of arrival rate which must then be used to determine the time until re-entry  ... 

The Robot Routing Problem for Collecting Aggregate Stochastic Rewards [article]

Rayna Dimitrova, Ivan Gavran, Rupak Majumdar, Vinayak S. Prabhu, and Sadegh Esmaeil Zadeh Soudjani
2017 arXiv   pre-print
We consider the finite and infinite-horizon robot routing problems.  ...  For the finite horizon, the goal is to maximize the total expected reward, while for the infinite horizon we consider limit-average objectives.  ...  Problem statements We investigate optimization and decision problems for finite and infinite-horizon robot routing.  ... 
arXiv:1704.05303v2 fatcat:qij5525pm5ebriltvf7vgcvn7i

Page 5242 of Mathematical Reviews Vol. , Issue 85k [page]

1985 Mathematical Reviews  
Author’s summary: “An infinite horizon, expected average cost, dynamic routing problem is formulated for a simple failure-prone queueing system, modelled as a continuous time, continuous state controlled  ...  The system objective is to find a policy for dynamically choosing the rates, based on the current rates and queue length, that minimizes the expected total discounted cost or average cost over an infinite  ... 

Route Planning: A [chapter]

Jean Walrand
2021 Probability in Electrical Engineering and Computer Science  
Section 13.4 discusses a generalization of the route planning problem: a Markov decision problem. Section 13.5 solves the problem when the horizon is infinite.  ...  Abstract: This chapter is concerned with making successive decisions in the presence of uncertainty. The decisions affect the cost at each step but also the "state" of the system.  ...  Infinite Horizon The problem of minimizing (13.6) involves a finite horizon. The problem stops at time n.  ... 
doi:10.1007/978-3-030-49995-2_13 fatcat:x3dwb42ipvgnhovr5zyg3smonu
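The finite-horizon minimization the chapter describes is solved by backward induction. A minimal sketch, with made-up states, costs, and transition probabilities (not Walrand's model):

```python
import numpy as np

# Backward induction for a finite-horizon MDP. All model data below are
# illustrative placeholders, not taken from the chapter.
n_states, n_actions, horizon = 3, 2, 5
rng = np.random.default_rng(0)
cost = rng.uniform(0.0, 1.0, size=(n_states, n_actions))          # c(s, a)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']

# V[t, s] = min_a { c(s, a) + sum_{s'} P[s, a, s'] * V[t+1, s'] }, V[horizon] = 0
V = np.zeros((horizon + 1, n_states))
policy = np.zeros((horizon, n_states), dtype=int)
for t in range(horizon - 1, -1, -1):
    Q = cost + P @ V[t + 1]        # Q[s, a]; matmul contracts over s'
    policy[t] = Q.argmin(axis=1)
    V[t] = Q.min(axis=1)
```

With nonnegative costs, the cost-to-go `V[t]` grows (elementwise) as more stages remain, and `policy[t]` records the minimizing action at each stage and state.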

Markov Decision Processes With Applications in Wireless Sensor Networks: A Survey

Mohammad Abu Alsheikh, Dinh Thai Hoang, Dusit Niyato, Hwee-Pink Tan, Shaowei Lin
2015 IEEE Communications Surveys and Tutorials  
This survey reviews numerous applications of the Markov decision process (MDP) framework, a powerful decision-making tool to develop adaptive algorithms and protocols for WSNs.  ...  For long service time and low maintenance cost, WSNs require adaptive and robust methods to address data exchange, topology formulation, resource and power optimization, sensing coverage and object detection  ...  By contrast, forward induction is applied when we only know the initial state. 2) Solutions for Infinite Time Horizon Markov Decision Processes: Solving an infinite time horizon MDP is more complex than  ... 
doi:10.1109/comst.2015.2420686 fatcat:422xkiciufedpljgcmfrwownau
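The survey's remark that infinite-time-horizon MDPs are harder than finite-horizon ones is usually addressed, in the discounted case, by value iteration: iterate the Bellman optimality backup until a fixed point. A sketch on a randomly generated model (illustrative only, not from the survey):

```python
import numpy as np

# Value iteration for an infinite-horizon discounted MDP with made-up data.
rng = np.random.default_rng(1)
S, A, gamma = 4, 2, 0.9
R = rng.uniform(size=(S, A))                  # bounded rewards
P = rng.dirichlet(np.ones(S), size=(S, A))    # P[s, a, s']

V = np.zeros(S)
for _ in range(2000):
    V_new = (R + gamma * (P @ V)).max(axis=1)  # Bellman optimality backup
    if np.max(np.abs(V_new - V)) < 1e-12:      # sup-norm stopping rule
        V = V_new
        break
    V = V_new
policy = (R + gamma * (P @ V)).argmax(axis=1)  # greedy policy from V
```

Because the backup is a gamma-contraction in the sup norm, the loop converges geometrically regardless of the initial `V`.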

Page 428 of Mathematical Reviews Vol. , Issue 80A [page]

1980 Mathematical Reviews  
Stochastic dynamic systems are considered which can be modelled as infinite-horizon, countable-state, continuous-time Markovian decision models.  ...  Kalin, Dieter 80a:90142 A note on: “Monotone optimal policies for Markov decision processes” (Math. Programming Stud. No. 6 (1976), 202-215) by R. F. Serfozo. “ Math.  ... 

Multitime scale Markov decision processes

Hyeong Soo Chang, P.J. Fard, S.I. Marcus, M. Shayman
2003 IEEE Transactions on Automatic Control  
Index Terms: Markov decision process, multi-time scale, hierarchical control, rolling horizon.  ...  This paper proposes a simple analytical model called multi-time scale Markov Decision Process (MMDP) for hierarchically structured sequential decision making processes, where decisions in each level in the  ...  This is because the lower bound on the result of Theorem 5.1 is 0, incorporating the fact that the infinite horizon average reward of any stationary decision rule is no bigger than the optimal infinite  ... 
doi:10.1109/tac.2003.812782 fatcat:ero44k7a3zco5cffv5omh4l4hm

Approximate receding horizon approach for Markov decision processes: average reward case

Hyeong Soo Chang, Steven I. Marcus
2003 Journal of Mathematical Analysis and Applications  
We consider an approximation scheme for solving Markov decision processes (MDPs) with countable state space, finite action space, and bounded rewards that uses an approximate solution of a fixed finite-horizon  ...  We first analyze the performance of the approximate receding horizon control for infinite-horizon average reward under an ergodicity assumption, which also generalizes the result obtained by White (J.  ...  In Section 2, we formally introduce Markov decision processes and in Section 3, we define the (approximate) receding horizon control and analyze its performance.  ... 
doi:10.1016/s0022-247x(03)00506-7 fatcat:nzx6vg7vargqnpdsc5mhvvik2a
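The receding horizon idea the paper analyzes can be sketched as: at each step, solve a finite H-step problem from the current state and apply only its first action. The generic model below is a placeholder, not the paper's construction:

```python
import numpy as np

# Receding horizon control sketch for a reward-maximizing MDP; model data
# are randomly generated for illustration.
rng = np.random.default_rng(2)
S, A, H = 3, 2, 6
R = rng.uniform(size=(S, A))
P = rng.dirichlet(np.ones(S), size=(S, A))    # P[s, a, s']

def receding_horizon_action(s):
    """First action of an optimal H-step policy from state s."""
    V = np.zeros(S)
    for _ in range(H - 1):                    # backward induction, stages H-1..1
        V = (R + P @ V).max(axis=1)
    return int((R[s] + P[s] @ V).argmax())    # stage-0 decision at state s

# Roll the controller forward a few steps.
s = 0
for _ in range(4):
    a = receding_horizon_action(s)
    s = int(rng.choice(S, p=P[s, a]))
```

Only the first action of each H-step solution is executed; the paper's contribution is bounding how far the resulting stationary behavior falls short of the optimal infinite-horizon average reward.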

Sensitivity Analysis for Markov Decision Process Congestion Games [article]

Sarah H.Q. Li, Daniel Calderone, Lillian Ratliff, Behcet Acikmese
2019 arXiv   pre-print
The decision makers optimize for their own expected costs, and influence each other through congestion effects on the state-action costs.  ...  We consider a non-atomic congestion game where each decision maker performs selfish optimization over states of a common MDP.  ...  INTRODUCTION Markov decision process (MDP) congestion games have been successfully used to model distributions of selfish decision makers when competing for finite resources [1] .  ... 
arXiv:1909.04167v2 fatcat:3nb25shimzf3fdjsegc7otp7sy

Robust Control of Markov Decision Processes with Uncertain Transition Matrices

Arnab Nilim, Laurent El Ghaoui
2005 Operations Research  
Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems.  ...  We consider a robust control problem for a finite-state, finite-action Markov decision process, where uncertainty on the transition matrices is described in terms of possibly nonconvex sets.  ...  Introduction Finite-state and finite-action Markov decision processes (MDPs) capture several attractive features that are important in decision making under uncertainty: they handle risk in sequential  ... 
doi:10.1287/opre.1050.0216 fatcat:sdopm7bnkfdg7cpnpjsixrff2a
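One simple instance of the robust control problem uses a finite scenario set of candidate transition matrices (a possibly nonconvex uncertainty set of the kind the paper permits): nature picks the worst scenario per state-action pair, the controller then maximizes. The scenarios below are randomly generated for illustration:

```python
import numpy as np

# Robust value iteration over a finite scenario set of transition models.
rng = np.random.default_rng(3)
S, A, K, gamma = 3, 2, 4, 0.9
R = rng.uniform(size=(S, A))
Ps = rng.dirichlet(np.ones(S), size=(K, S, A))    # Ps[k, s, a, s']

V = np.zeros(S)
for _ in range(2000):
    # Adversary picks the worst scenario per (s, a); controller maximizes.
    Q_worst = (R + gamma * (Ps @ V)).min(axis=0)  # shape (S, A)
    V_new = Q_worst.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-12:
        V = V_new
        break
    V = V_new
```

The robust backup is still a gamma-contraction, so the max-min iteration converges to the robust value, hedged against whichever of the K models nature chooses at each state-action pair.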