
Solving POMDPs by Searching in Policy Space [article]

Eric A. Hansen
2013 arXiv   pre-print
This paper presents an approach to solving POMDPs that represents a policy explicitly as a finite-state controller and iteratively improves the controller by search in policy space.  ...  Most algorithms for solving POMDPs iteratively improve a value function that implicitly represents a policy and are said to search in value function space.  ...  Support for this work was provided in part by the National Science Foundation under grants IRI-9624992, IRI-9634938 and INT-9612092.  ... 
arXiv:1301.7380v1 fatcat:kfpp7ml6ovgvnaqwtrewjfbd5m

Partially Observable Markov Decision Processes [chapter]

Thomas Zeugmann, Pascal Poupart, James Kennedy, Xin Jin, Jiawei Han, Lorenza Saitta, Michele Sebag, Jan Peters, J. Andrew Bagnell, Walter Daelemans, Geoffrey I. Webb, Kai Ming Ting (+12 others)
2011 Encyclopedia of Machine Learning  
Macro-actions • MCVI essentially unchanged when used with macro-actions • Macro-actions can in turn be constructed by solving a simpler POMDP • Collaborative Search and Capture - MCVI: reward  ...  • Online search POMDP Online Search • A policy needs to work well on essentially the whole of an optimal reachable space - In the worst case, there is no small policy • May sometimes still be able to do  ... 
doi:10.1007/978-0-387-30164-8_629 fatcat:hj6hnbjtn5fshpq4jnxpsaj7dq

MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs [article]

Daniel Szer, Francois Charpillet, Shlomo Zilberstein
2012 arXiv   pre-print
We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially-observable Markov decision problems (DEC-POMDPs) with finite horizon.  ...  Solving such problems effectively is a major challenge in the area of planning under uncertainty. Our solution is based on a synthesis of classical heuristic search and decentralized control theory.  ...  MAA* searches in policy space: evaluation and exploration is based on policy vectors.  ... 
arXiv:1207.1359v1 fatcat:qlhernresjc7hoy77gwctnyneq

VDCBPI: an Approximate Scalable Algorithm for Large POMDPs

Pascal Poupart, Craig Boutilier
2004 Neural Information Processing Systems  
and the policy space complexity.  ...  This paper describes a new algorithm (VDCBPI) that mitigates both sources of intractability by combining the Value Directed Compression (VDC) technique [13] with Bounded Policy Iteration (BPI) [14].  ...  Bounded Policy Iteration with Value-Directed Compression In principle, any POMDP algorithm can be used to solve the compressed POMDPs produced by VDC.  ... 
dblp:conf/nips/PoupartB04 fatcat:yqrw5fbqqrbqlihmqekuy6n7xi

An approximate algorithm for solving oracular POMDPs

Nicholas Armstrong-Crews, Manuela Veloso
2008 2008 IEEE International Conference on Robotics and Automation  
to the size of the state and observation spaces, thereby showing rigorously that OPOMDPs are "easier" than POMDPs.  ...  We propose a new approximate algorithm, LA-JIV (Lookahead J-MDP Information Value), to solve Oracular Partially Observable Markov Decision Problems (OPOMDPs), a special type of POMDP that rather than standard  ...  Hence, the POMDP is equivalent to an MDP in the continuous belief space simplex (called the belief MDP), and so can be solved by dynamic programming [12] .  ... 
doi:10.1109/robot.2008.4543721 dblp:conf/icra/Armstrong-CrewsV08 fatcat:4iwejeh7fnbhdnb74hpcgutmfm

Solving POMDPs by Searching the Space of Finite Policies [article]

Nicolas Meuleau, Kee-Eung Kim, Leslie Pack Kaelbling, Anthony R. Cassandra
2013 arXiv   pre-print
Solving partially observable Markov decision processes (POMDPs) is highly intractable in general, at least in part because the optimal policy may be infinitely large.  ...  In this paper, we explore the problem of finding the optimal policy from a restricted set of policies, represented as finite state automata of a given size.  ...  In this case, one may wish to restrict further the search space by imposing structural constraints on the policy graph.  ... 
arXiv:1301.6720v1 fatcat:rq2dryw4frfxba3hkb2chu42k4

Partially Observable Markov Decision Processes (POMDPs) and Robotics [article]

Hanna Kurniawati
2021 arXiv   pre-print
However, since early 2000, POMDPs solving capabilities have advanced tremendously, thanks to sampling-based approximate solvers.  ...  This paper presents a review of POMDPs, emphasizing computational issues that have hindered its practicality in robotics and ideas in sampling-based solvers that have alleviated such difficulties, together  ...  ACKNOWLEDGMENTS This work is supported by the ANU Futures Scheme.  ... 
arXiv:2107.07599v1 fatcat:lu7zmlpqjrfvbbawvjtnr45pey

Extending the Applicability of POMDP Solutions to Robotic Tasks

Devin K. Grady, Mark Moll, Lydia E. Kavraki
2015 IEEE Transactions on Robotics  
Determining an approximately optimal action policy for POMDPs is PSPACE-complete, and the exponential growth of computation time prohibits solving large tasks.  ...  We empirically demonstrate the performance gain provided by these two techniques through simulated execution in a variety of environments.  ...  Search and Rescue In our results for the search and rescue tasks, it is clear from overall reward (Figure 9, center row) and success rates (Figure 9, bottom row) that the policies computed by MCVI  ... 
doi:10.1109/tro.2015.2441511 fatcat:rzcjhfmmabc6bkv6le6bxironi

Decision Making in Complex Multiagent Contexts: A Tale of Two Frameworks

Prashant J. Doshi
2012 The AI Magazine  
I put the two frameworks, decentralized partially observable Markov decision process (Dec-POMDP) and the interactive partially observable Markov decision process (I-POMDP), in context and review the foundational  ...  I conclude by examining the avenues that research pertaining to these frameworks is pursuing.  ...  Acknowledgments This work was supported in part by NSF CAREER grant, #IIS-0845036, and in part by a grant from the U.S. Air Force, #FA9550-08-1-0429.  ... 
doi:10.1609/aimag.v33i4.2402 fatcat:peqlr3rr5bghffao6zjowl7amy

An Optimal Best-First Search Algorithm for Solving Infinite Horizon DEC-POMDPs [chapter]

Daniel Szer, François Charpillet
2005 Lecture Notes in Computer Science  
In the domain of decentralized Markov decision processes, we develop the first complete and optimal algorithm that is able to extract deterministic policy vectors based on finite state controllers for  ...  We believe this to be an important step forward in learning and planning in stochastic multi-agent systems.  ...  Best-first Search for Infinite Horizon DEC-POMDPs Solving Markov decision problems usually involves maximizing an evaluation function in either state space or policy space, with our approach being an example  ... 
doi:10.1007/11564096_38 fatcat:zrfjvrga7vhenht3dgi2po6yda

My Brain is Full: When More Memory Helps [article]

Christopher Lusena, Tong Li, Shelia Sittinger, Chris Wells, Judy Goldsmith
2013 arXiv   pre-print
We compare run times of each policy and of a dynamic programming algorithm for POMDPs developed by Hansen that iteratively improves a finite-state controller --- the previous state of the art for finite  ...  The policies considered are free finite-memory policies with limited memory; a policy is a mapping from the space of observation-memory pairs to the space of action-memory pairs (the policy updates  ...  Acknowledgements This research supported in part by NSF grant CCR-9610348.  ... 
arXiv:1301.6715v1 fatcat:yduhxtnhzbapxlbo6sbrmzwp5y

pomdp_py: A Framework to Build and Solve POMDP Problems [article]

Kaiyu Zheng, Stefanie Tellex
2020 arXiv   pre-print
In this paper, we present pomdp_py, a general purpose Partially Observable Markov Decision Process (POMDP) library written in Python and Cython.  ...  We also describe intuitive integration of this library with ROS (Robot Operating System), which enabled our torso-actuated robot to perform object search in 3D.  ...  We developed a novel approach to model and solve an OO-POMDP for the task of multi-object search in 3D.  ... 
arXiv:2004.10099v1 fatcat:htricloicvgpdiwkwubonqlm6i

Pre- and post-contact policy decomposition for planar contact manipulation under uncertainty

Michael C. Koval, Nancy S. Pollard, Siddhartha S. Srinivasa
2015 The International Journal of Robotics Research  
Our method uses an offline point-based solver on a variable-resolution discretization of the state space to solve for a post-contact policy as a pre-computation step.  ...  We demonstrate that it is intractable to solve the full POMDP with traditional techniques and introduce a novel decomposition of the policy into pre- and post-contact stages to reduce the computational  ...  ACKNOWLEDGMENTS This work was supported by a NASA Space Technology Research Fellowship.  ... 
doi:10.1177/0278364915594474 fatcat:frjv2dro4ze6vay4sai5ns6tiy

Pre- and Post-Contact Policy Decomposition for Planar Contact Manipulation Under Uncertainty

Michael Koval, Nancy Pollard, Siddhartha Srinivasa
2014 Robotics: Science and Systems X  
Our method uses an offline point-based solver on a variable-resolution discretization of the state space to solve for a post-contact policy as a pre-computation step.  ...  We demonstrate that it is intractable to solve the full POMDP with traditional techniques and introduce a novel decomposition of the policy into pre- and post-contact stages to reduce the computational  ...  ACKNOWLEDGMENTS This work was supported by a NASA Space Technology Research Fellowship.  ... 
doi:10.15607/rss.2014.x.034 dblp:conf/rss/KovalPS14 fatcat:xsgv2um4zbfitdftgr4rmainbu

Point-Based Methods for Model Checking in Partially Observable Markov Decision Processes [article]

Maxime Bouton, Jana Tumova, Mykel J. Kochenderfer
2020 arXiv   pre-print
We propose a methodology to synthesize policies that satisfy a linear temporal logic formula in a partially observable Markov decision process (POMDP).  ...  We demonstrate that our method scales to large POMDP domains and provides strong bounds on the performance of the resulting policy.  ...  Acknowledgment This work was supported by the Honda Research Institute. The authors thank Sebastian Junges, Nils Jansen, and Emma Brunskill for their advice on the early stages of this work.  ... 
arXiv:2001.03809v1 fatcat:s7jaksgjpze7hashe2h3g5rdsy