
Verified Probabilistic Policies for Deep Reinforcement Learning [article]

Edoardo Bacci, David Parker
2022 arXiv   pre-print
We propose an abstraction approach, based on interval Markov decision processes, that yields probabilistic guarantees on a policy's execution, and present techniques to build and solve these models using  ...  abstract interpretation, mixed-integer linear programming, entropy-based refinement and probabilistic model checking.  ...  To build abstractions, we use interval Markov decision processes (IMDPs). Definition 2 (Interval Markov decision process).  ... 
arXiv:2201.03698v1 fatcat:6q6tle2d45aphn6gicqci7h5f4
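The entry above builds its abstractions from interval Markov decision processes (IMDPs), where each transition probability is only known to lie in an interval. As a hedged illustration (the data layout and function names are my own, not taken from the paper), a minimal Python sketch of an IMDP and a well-formedness check that some valid distribution fits each set of intervals:

```python
# Minimal sketch of an interval MDP (IMDP): each transition probability is
# an interval [lo, hi]; a concrete distribution must pick one value per
# interval and sum to exactly 1.

def intervals_feasible(intervals):
    """True iff some probability distribution fits the intervals:
    every lo <= hi, the lows sum to at most 1, the highs to at least 1."""
    lows = sum(lo for lo, hi in intervals)
    highs = sum(hi for lo, hi in intervals)
    return all(0 <= lo <= hi <= 1 for lo, hi in intervals) and lows <= 1 <= highs

# IMDP as {state: {action: {successor: (lo, hi)}}} -- a toy 2-state example
imdp = {
    "s0": {"a": {"s0": (0.1, 0.4), "s1": (0.6, 0.9)}},
    "s1": {"a": {"s1": (1.0, 1.0)}},
}

for state, actions in imdp.items():
    for action, succ in actions.items():
        assert intervals_feasible(list(succ.values())), (state, action)
```
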

Temporal logic control of general Markov decision processes by approximate policy refinement

Sofie Haesaert, Sadegh Soudjani, Alessandro Abate
2018 IFAC-PapersOnLine  
The formal verification and controller synthesis for Markov decision processes that evolve over uncountable state spaces are computationally hard and thus generally rely on the use of approximations.  ...  In this work, we consider the correct-by-design control of general Markov decision processes (gMDPs) with respect to temporal logic properties by leveraging approximate probabilistic relations between  ...  General Markov decision processes and control strategies General Markov decision processes extend upon Markov decision processes (Bertsekas and Shreve, 1996) and are formalised next. Definition 1.  ... 
doi:10.1016/j.ifacol.2018.08.013 fatcat:3vej5xkzizbg5mvdvor5wufnya

Analysing Decisive Stochastic Processes

Nathalie Bertrand, Patricia Bouyer, Thomas Brihaye, Pierre Carlier, Marc Herbstritt
2016 International Colloquium on Automata, Languages and Programming  
For instance, the approximate quantitative reachability problem can be solved for decisive Markov chains (enjoying reasonable effectiveness assumptions) including probabilistic lossy channel systems and  ...  This allows us to obtain decidability results for both qualitative and quantitative verification problems on some classes of real-time stochastic processes, including generalized semi-Markov processes  ...  Also, the approximation scheme for reachability properties can be adapted to evaluate an expected accumulated reward, provided the reward evolves linearly in the model, as in Markov reward models [5,  ... 
doi:10.4230/lipics.icalp.2016.101 dblp:conf/icalp/0001BBC16 fatcat:brpldifmvjd2dpc5vhrvaal6ky
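The paper above treats approximate quantitative reachability for infinite-state decisive Markov chains; as a hedged, finite-state stand-in (not the paper's algorithm for lossy channel systems), the quantity in question can be approximated on a finite chain by iterating the standard reachability fixpoint until a tolerance is met:

```python
# Sketch: approximate the probability of reaching a target set in a finite
# discrete-time Markov chain by iterating x <- P x, with x fixed to 1 on
# the target (value iteration for reachability, stopped at tolerance eps).

def reach_prob(P, target, eps=1e-10, max_iter=100000):
    states = list(P)
    x = {s: (1.0 if s in target else 0.0) for s in states}
    for _ in range(max_iter):
        delta = 0.0
        for s in states:
            if s in target:
                continue
            new = sum(p * x[t] for t, p in P[s].items())
            delta = max(delta, abs(new - x[s]))
            x[s] = new  # Gauss-Seidel style in-place update
        if delta < eps:
            break
    return x

# Chain: s0 -> goal w.p. 0.5, s0 -> sink w.p. 0.5; goal and sink absorb
P = {"s0": {"goal": 0.5, "sink": 0.5}, "goal": {"goal": 1.0}, "sink": {"sink": 1.0}}
probs = reach_prob(P, {"goal"})
```
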

Probabilistic Guarantees for Safe Deep Reinforcement Learning [article]

Edoardo Bacci, David Parker
2020 arXiv   pre-print
Our approach is based on the iterative construction of a formal abstraction of a controller's execution in an environment, and leverages probabilistic model checking of Markov decision processes to produce  ...  We implement and evaluate our approach on agents trained for several benchmark control problems.  ...  This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 834115, FUN2MODEL).  ... 
arXiv:2005.07073v2 fatcat:wfngzaajozfdfnjwi3nxdiz5ei

Performance analysis of probabilistic action systems

Stefan Hallerstede, Michael Butler
2004 Formal Aspects of Computing  
Numerical methods solving the optimisation problems posed by Markov decision processes are well-known, and used in a software tool that we have developed.  ...  A corresponding notion of refinement and simulation-based proof rules are introduced. Probabilistic action systems are based on discrete-time Markov decision processes.  ...  The latter being particularly important for optimal systems that solve the Markov decision process associated with a probabilistic action system.  ... 
doi:10.1007/s00165-004-0037-6 fatcat:mjyysgl2gje3zn6nbnvsca75se
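The entry above mentions numerical methods for the optimisation problems posed by Markov decision processes. As a generic, hedged sketch of one such method (plain discounted value iteration, not the authors' tool):

```python
# Sketch of value iteration for a finite discounted MDP:
#   V(s) <- max_a [ r(s, a) + gamma * sum_s' P(s' | s, a) * V(s') ]

def value_iteration(mdp, rewards, gamma=0.9, eps=1e-8):
    V = {s: 0.0 for s in mdp}
    while True:
        delta = 0.0
        for s, actions in mdp.items():
            best = max(
                rewards[(s, a)] + gamma * sum(p * V[t] for t, p in succ.items())
                for a, succ in actions.items()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

# Two-state example: "stay" earns 0, "go" earns 1 and moves to an absorbing state
mdp = {"s0": {"stay": {"s0": 1.0}, "go": {"s1": 1.0}}, "s1": {"loop": {"s1": 1.0}}}
rewards = {("s0", "stay"): 0.0, ("s0", "go"): 1.0, ("s1", "loop"): 0.0}
V = value_iteration(mdp, rewards)  # optimal policy: take "go" immediately
```
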

An Anytime Algorithm for Task and Motion MDPs [article]

Siddharth Srivastava, Nishant Desai, Richard Freedman, Shlomo Zilberstein
2018 arXiv   pre-print
We present a new approach where the high-level decision problem occurs in a stochastic setting and can be modeled as a Markov decision process.  ...  Integrated task and motion planning has emerged as a challenging problem in sequential decision making, where a robot needs to compute high-level strategy and low-level motion plans for solving complex  ...  Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the DARPA or SSC Pacific.  ... 
arXiv:1802.05835v1 fatcat:ljxnukoqjvd2bh4c3zpjoc5wke

Computation techniques for large scale undiscounted Markov decision processes

Thom J. Hodgson, Gary J. Koehler
1979 Naval Research Logistics Quarterly  
We now briefly turn our attention to the continuous time Markov decision process and the semi-Markov decision process (which itself subsumes both the continuous and discrete time models as special cases  ... 
doi:10.1002/nav.3800260404 fatcat:xhzjyxgnpzfqzp5uvclfezj6ey

Solving Hybrid Markov Decision Processes [chapter]

Alberto Reyes, L. Enrique Sucar, Eduardo F. Morales, Pablo H. Ibargüengoytia
2006 Lecture Notes in Computer Science  
Markov decision processes (MDPs) have developed as a standard for representing uncertainty in decision-theoretic planning.  ...  In this paper a reward-based abstraction for solving hybrid MDPs is presented.  ...  Acknowledgments This work was supported in part by IIE Project No. 12941 and CONACYT Project No. 47968.  ... 
doi:10.1007/11925231_22 fatcat:pfd65gbga5ayzcet7bgwd5xm3a

Perspectives in Probabilistic Verification

Joost-Pieter Katoen
2008 2008 2nd IFIP/IEEE International Symposium on Theoretical Aspects of Software Engineering  
This paper surveys the main achievements during the last two decades, reports on recent advances, and attempts to point out some research challenges for the coming years.  ...  but later efficient algorithms were developed for quantitative questions as well.  ...  My research is supported by the EU FP7 QUASIMODO project, the NWO projects MC=MC and QUPES, as well as the DFG research training group ALGOSYN.  ... 
doi:10.1109/tase.2008.44 dblp:conf/tase/Katoen08 fatcat:34t7l56fwfep3psrn42cqe2m4m

Robust Dynamic Programming for Temporal Logic Control of Stochastic Systems [article]

Sofie Haesaert, Sadegh Soudjani
2018 arXiv   pre-print
Firstly, robust dynamic programming mappings over the abstract system are introduced to solve the control synthesis and verification problem.  ...  For this class of models, methods for the formal verification and synthesis of control strategies are computationally hard and generally rely on the use of approximate abstractions.  ...  ACKNOWLEDGEMENT The authors would like to acknowledge Alessandro Abate for his contributions to the preceding article presented at the ADHS conference [12] .  ... 
arXiv:1811.11445v1 fatcat:zzkwsdnbqrbzpfqneafydnjlta

Advances and challenges of probabilistic model checking

Marta Kwiatkowska, Gethin Norman, David Parker
2010 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton)  
Probabilistic model checking is a powerful technique for formally verifying quantitative properties of systems that exhibit stochastic behaviour.  ...  In this paper, we give a short overview of probabilistic model checking and of PRISM, currently the leading software tool in this area.  ...  The authors are supported in part by EPSRC projects EP/D07956X and EP/F001096, EU FP7 project CONNECT and ERC Advanced Grant VERIWARE.  ... 
doi:10.1109/allerton.2010.5707120 fatcat:bauounyehvelpe4dwjrobeugfe

Advances in Probabilistic Model Checking [chapter]

Joost-Pieter Katoen
2010 Lecture Notes in Computer Science  
The first half of the tutorial concerns two classical probabilistic models, discrete-time Markov chains and Markov decision processes, explaining the underlying theory and model checking algorithms for  ...  The second half discusses two advanced topics: quantitative abstraction refinement and model checking for probabilistic timed automata.  ...  The authors are part supported by ERC Advanced Grant VERIWARE, EPSRC grant EP/F001096/1 and EU-FP7 project CONNECT.  ... 
doi:10.1007/978-3-642-11319-2_5 fatcat:o2vlzsyzbferjhexkbbdcct5vq

Fast stochastic motion planning with optimality guarantees using local policy reconfiguration

Ryan Luna, Morteza Lahijanian, Mark Moll, Lydia E. Kavraki
2014 2014 IEEE International Conference on Robotics and Automation (ICRA)  
The motion of the system is abstracted to a class of uncertain Markov models known as bounded-parameter Markov decision processes (BMDPs).  ...  During the abstraction, an efficient sampling-based method for stochastic optimal control is used to construct several policies within a discrete region of the state space in order for the system to transit  ...  Vardi for his helpful discussions and insights, as well as Ryan Christiansen and the other Kavraki Lab members for valuable input on this work.  ... 
doi:10.1109/icra.2014.6907293 dblp:conf/icra/LunaLMK14 fatcat:3zev3k23dvbazmf735bbwnfbru
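The entry above abstracts the system to bounded-parameter MDPs (BMDPs), whose transition probabilities are intervals. As a hedged sketch of the core robust backup (my own minimal version, not the paper's planner): to get a pessimistic value, the adversary places as much probability mass as the intervals allow on the lowest-valued successors:

```python
# Sketch: pessimistic ("robust") probability assignment for one BMDP backup
# step. Given interval bounds (lo, hi) per successor and current values V,
# pick the distribution within the intervals that minimises sum p * V.

def worst_case_expectation(bounds, V):
    """bounds: {succ: (lo, hi)}; V: {succ: value}. Start from the lower
    bounds, then push the remaining mass onto the lowest-valued successors."""
    p = {t: lo for t, (lo, hi) in bounds.items()}
    budget = 1.0 - sum(p.values())
    for t in sorted(bounds, key=lambda t: V[t]):  # cheapest successors first
        lo, hi = bounds[t]
        extra = min(hi - lo, budget)
        p[t] += extra
        budget -= extra
    return sum(p[t] * V[t] for t in bounds)

bounds = {"good": (0.2, 0.8), "bad": (0.2, 0.8)}
V = {"good": 1.0, "bad": 0.0}
val = worst_case_expectation(bounds, V)  # adversary puts 0.8 on "bad"
```

The optimistic counterpart is symmetric: iterate the successors in descending value order instead.
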

Online Abstraction with MDP Homomorphisms for Deep Learning [article]

Ondrej Biza, Robert Platt
2019 arXiv   pre-print
Abstraction of Markov Decision Processes is a useful tool for solving complex problems, as it can ignore unimportant aspects of an environment, simplifying the process of learning an optimal policy.  ...  In this paper, we propose a new algorithm for finding abstract MDPs in environments with continuous state spaces. It is based on MDP homomorphisms, a structure-preserving mapping between MDPs.  ...  In the context of the Markov Decision Process (MDP), state abstraction can be understood using an elegant approach known as the MDP homomorphism framework [13] .  ... 
arXiv:1811.12929v2 fatcat:bikmm7oi5ra5tff4ofgrphrzpm
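The entry above rests on MDP homomorphisms: structure-preserving maps under which reward and block-level transition probabilities agree. As a hedged illustration (a simplified, state-only check of my own, not the paper's learned-abstraction algorithm):

```python
# Sketch: check whether a state mapping f is an MDP homomorphism (actions
# mapped identically here): rewards must agree across states in the same
# abstract block, and so must the probability mass landing in each block.

from collections import defaultdict

def is_homomorphism(mdp, rewards, f):
    for s, actions in mdp.items():
        for a, succ in actions.items():
            block_mass = defaultdict(float)  # mass per abstract successor block
            for t, p in succ.items():
                block_mass[f(t)] += p
            for s2, actions2 in mdp.items():  # states in the same block as s
                if f(s2) != f(s) or a not in actions2:
                    continue
                if abs(rewards[(s, a)] - rewards[(s2, a)]) > 1e-9:
                    return False
                mass2 = defaultdict(float)
                for t, p in actions2[a].items():
                    mass2[f(t)] += p
                if any(abs(block_mass[b] - mass2[b]) > 1e-9
                       for b in set(block_mass) | set(mass2)):
                    return False
    return True

# Two concrete states s0a, s0b behave identically, so they may collapse to
# one abstract state "S0" without losing optimal-policy information.
mdp = {"s0a": {"a": {"g": 1.0}}, "s0b": {"a": {"g": 1.0}}, "g": {"a": {"g": 1.0}}}
rewards = {("s0a", "a"): 1.0, ("s0b", "a"): 1.0, ("g", "a"): 0.0}
f = lambda s: "S0" if s in ("s0a", "s0b") else "G"
ok = is_homomorphism(mdp, rewards, f)  # True: s0a and s0b are interchangeable
```
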

Guided search for task and motion plans using learned heuristics

Rohan Chitnis, Dylan Hadfield-Menell, Abhishek Gupta, Siddharth Srivastava, Edward Groshev, Christopher Lin, Pieter Abbeel
2016 2016 IEEE International Conference on Robotics and Automation (ICRA)  
Our contributions are as follows: 1) we present a complete algorithm for TAMP; 2) we present a randomized local search algorithm for plan refinement that is easily formulated as a Markov decision process  ...  We present an algorithm that searches the space of possible task and motion plans and uses statistical machine learning to guide the search process.  ...  ACKNOWLEDGMENTS This research was funded in part by the Intel Science and Technology Center (ISTC) on Robotics and Embedded Systems.  ... 
doi:10.1109/icra.2016.7487165 dblp:conf/icra/ChitnisHGSGLA16 fatcat:5ny4jxztrfbvjguogt4rtd57li