Geometry and Determinism of Optimal Stationary Control in Partially Observable Markov Decision Processes
[article]
2016
arXiv
pre-print
For partially observable Markov decision processes (POMDPs), optimal memoryless policies are generally stochastic. ...
It is well known that for any finite state Markov decision process (MDP) there is a memoryless deterministic policy that maximizes the expected reward. ...
Partially observable Markov decision processes A discrete time partially observable Markov decision process (POMDP) is defined by a tuple (W, S, A, α, β, R), where W is a finite set of world states, S ...
arXiv:1503.07206v2
fatcat:7e7s74p5mrbkhm7rjstfcaig6q
Page 5900 of Mathematical Reviews Vol. , Issue 86m
[page]
1986
Mathematical Reviews
A. 86m:90180 Sufficient statistics in a game-theoretic problem of the control of a partially observable linear diffusion process. (Russian. ...
Zijm, Henk 86m:90174 The optimality equations in multichain denumerable state Markov decision processes with the average cost criterion: the bounded cost case.
Statist. ...
Page 2935 of Mathematical Reviews Vol. , Issue 92e
[page]
1992
Mathematical Reviews
Two control problems are considered for a partially observed Markov chain with countably infinite states. One is an infinite horizon discounted cost problem. ...
Onésimo Hernandez Lerma (Mexico City)
92e:90104 90C40
Borkar, Vivek S. (6-IIS-EE)
A remark on control of partially observed Markov chains.
Ann. Oper. Res. 29 (1991), no. 1-4, 429-438. ...
Learning grasp strategies composed of contact relative motions
2007
2007 7th IEEE-RAS International Conference on Humanoid Robots
This paper expresses the partially observable problem as a k-order Markov Decision Process (MDP) and solves it using Reinforcement Learning. ...
Since local force feedback information usually does not completely determine system state, the control problem is partially observable. ...
Section III poses grasp synthesis as an optimal control problem and solves it as a k-order Markov Decision Process. ...
doi:10.1109/ichr.2007.4813848
dblp:conf/humanoids/Platt07
fatcat:jad6ggz5fne2ljtnr5ngxggypi
Page 2247 of Mathematical Reviews Vol. , Issue 83e
[page]
1983
Mathematical Reviews
Mathematical techniques of optimization, control and decision, pp. 131-149, Birkhäuser, Boston, Mass., 1981. ...
They prove that such a game with a discount factor has an optimal value function and that both players have optimal stationary strategies. ...
Structured Replacement Policies for Components with Complex Degradation Processes and Dedicated Sensors
2011
Operations Research
Next, we formulate a single-unit replacement problem as a Markov decision process and utilize the realtime signal observations to determine a replacement policy. ...
We focus on exponentially increasing degradation signals and show that the optimal replacement policy for this class of problems is a monotonically nondecreasing control limit policy. ...
Kharoufeh from the University of Pittsburgh for their extensive feedback and helpful insights, which helped in strengthening this paper and aiding in its publication. ...
doi:10.1287/opre.1110.0912
fatcat:q64stckqtraj7bumhssvh26q2a
Partially Observed, Multi-objective Markov Games
[article]
2014
arXiv
pre-print
This leader-follower assumption allows the POMG to be transformed into a specially structured, partially observed Markov decision process (POMDP). ...
The problem is described by an infinite horizon, partially observed Markov game (POMG). ...
This assumption allows the POMG to be converted into a partially observed Markov decision process (POMDP). ...
arXiv:1404.4388v1
fatcat:p5d6v6627vca3kplycwufrxpri
Cooperative navigation for heterogeneous autonomous vehicles via approximate dynamic programming
2011
IEEE Conference on Decision and Control and European Control Conference
avoidance constraints and searching for stationary and mobile targets. ...
The mobile sensor network consists of a set of robotic sensors modeled as hybrid systems with processing capabilities. ...
ACKNOWLEDGMENTS This work was supported by NSF ECCS grant #1027775, and by the Department of Energy URPR Grant #DE-FG52-04NA25590. ...
doi:10.1109/cdc.2011.6161127
dblp:conf/cdc/FerrariAFL11
fatcat:ycaulpimpfhvho5qgxipybjpjq
Cooperative Multiagent Deep Deterministic Policy Gradient (CoMADDPG) for Intelligent Connected Transportation with Unsignalized Intersection
2020
Mathematical Problems in Engineering
Unsignalized intersection control is one of the most critical issues in intelligent transportation systems, which requires connected and automated vehicles to support more frequent information interaction ...
with the scenario of unsignalized intersection control. ...
Cooperative Multiagent Deep Deterministic Policy Gradient In this paper, partially observable Markov games are considered, constituting a multiagent Markov decision process. The possible state S, a set of ...
doi:10.1155/2020/1820527
fatcat:opkcxvn5vbhbfciytszlfmz7iy
Simulation-based optimization of Markov decision processes: An empirical process theory approach
2010
Automatica
The goal of this paper is to extend the reach of this rich and rapidly developing theory to Markov decision processes and Multiarmed bandits problems, and use this framework to solve the optimal policy ...
We generalize and build on the PAC Learning framework for Markov Decision Processes developed in Jain and Varaiya (2006). We consider the reward function to depend on both the state and the action. ...
We propose an empirical process theory approach to simulation-based optimization of Markov decision processes. ...
doi:10.1016/j.automatica.2010.05.021
fatcat:liylwmxl3ngijcfnanpdntcjai
Anomaly detection using projective Markov models in a distributed sensor network
2009
Proceedings of the 48h IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference
The paper develops application of techniques from robust and universal hypothesis testing for anomaly detection and change-point detection in dynamic, interconnected systems. ...
This theory is extended using the concept of projected Markov models originally proposed by Claude Shannon. ...
Multiple Models and Partial Information 1) Partial and Distributed Information: Suppose that we observe only a few functions of the process Z. ...
doi:10.1109/cdc.2009.5400612
dblp:conf/cdc/MeynSLN09
fatcat:kkemnuftang47jnisyuirirwkq
Eleventh conference on stochastic processes and their applications
1984
Stochastic Processes and their Applications
Filtering and stochastic control
Optimal control of Markov Processes
Arie Hordijk, University of Leiden, The Netherlands Firstly we consider Markov decision chains with a denumerable state space and ...
Problems of this type often arise in connection with Markov decision processes. ...
doi:10.1016/0304-4149(84)90173-x
fatcat:wanpovevh5bltlkl7ngbxbi2ga
A geometric optimization approach to tracking maneuvering targets using a heterogeneous mobile sensor network
2009
Proceedings of the 48h IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference
The targets are modeled by a Markov motion process that is commonly used in target tracking applications. ...
Since the sensors are installed on mobile robots and have limited range, the geometry of their platforms and fields-of-view play a critical role in motion planning and obstacle avoidance. ...
ACKNOWLEDGMENTS This work is supported in part by the Office of Naval Research (Code 321), and by NSF grant ECS CAREER #0448906. The work of R. ...
doi:10.1109/cdc.2009.5400166
dblp:conf/cdc/FerrariFT09
fatcat:jrmx4atf4zeozb7b7gt6xxnoku
Acquiring state from control dynamics to learn grasping policies for robot hands
2002
Advanced Robotics
For grasping and manipulation, we propose a closed-loop control process that is parametric in the number and identity of contact resources. ...
A grasp controller can thus be tuned on-line to optimize performance over a variety of object geometries. ...
Acknowledgements This work was supported in part by the National Science Foundation under grants CISE /CDA-9703217, IRI-9704530 and IRI-9503687. ...
doi:10.1163/15685530260182927
fatcat:bizzxnp43jculbnp3zaoznuu2m
Partially Observable Markov Decision Processes: A Geometric Technique and Analysis
2010
Operations Research
This paper presents a novel framework for studying partially observable Markov decision processes (POMDPs) with finite state, action, observation sets, and discounted rewards. ...
It reveals the connection between the POMDP problem and two computational geometry problems, i.e., finding the vertices of a convex hull and finding the Minkowski sum of convex polytopes, which can help ...
Acknowledgments The author thanks the associate editor and two anonymous referees for their constructive suggestions that improved the exposition of this paper, and Mahesh Nagarajan for his helpful comments ...
doi:10.1287/opre.1090.0697
fatcat:rmzqivhlhbg55e3euejxmsqlti