172 Hits in 4.7 sec

Sequential Convex Programming for the Efficient Verification of Parametric MDPs [article]

Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ivan Papusha, Hasan A. Poonawala, Ufuk Topcu
2017 arXiv   pre-print
This insight allows for a sequential optimization algorithm to efficiently compute sound but possibly suboptimal solutions. Each stage of this algorithm solves a geometric programming problem.  ...  Multi-objective verification problems of parametric Markov decision processes under optimality criteria can be naturally expressed as nonlinear programs.  ...  We discuss a general nonlinear programming formulation for the verification of parametric Markov decision processes (pMDPs).  ... 
arXiv:1702.00063v1 fatcat:7eqtbtjcfvbghmf4vljx2i7ipa
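To make the pMDP setting concrete, here is a minimal sketch (a hypothetical two-transient-state chain, not one of the paper's benchmarks): the reachability probability of an instantiated parametric Markov chain is obtained by solving a linear system, and as a function of the parameter it is a rational function — the structure that nonlinear-programming formulations exploit.

```python
import numpy as np

def reach_prob(p):
    """Probability of reaching the target from state 0 in a toy
    parametric Markov chain (hypothetical example). Transient states:
    0 -> 1 with prob p (else sink), 1 -> target with prob p
    (else back to 0)."""
    # Q: transitions among transient states; b: one-step mass into target
    Q = np.array([[0.0, p], [1.0 - p, 0.0]])
    b = np.array([0.0, p])
    return np.linalg.solve(np.eye(2) - Q, b)[0]

# The solution equals the rational function p^2 / (1 - p + p^2)
assert abs(reach_prob(0.5) - 0.5**2 / (1 - 0.5 + 0.5**2)) < 1e-12
```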

Scenario-Based Verification of Uncertain MDPs [article]

Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
2020 arXiv   pre-print
We consider Markov decision processes (MDPs) in which the transition probabilities and rewards belong to an uncertainty set parametrized by a collection of random variables.  ...  The probability distributions for these random parameters are unknown.  ...  The works in [27, 25] consider the verification of MDPs with convex uncertainties.  ... 
arXiv:1912.11223v2 fatcat:u2p2b4kwczce7higtuo72uqvuy

Scenario-Based Verification of Uncertain MDPs [chapter]

Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
2020 Lecture Notes in Computer Science  
We consider Markov decision processes (MDPs) in which the transition probabilities and rewards belong to an uncertainty set parametrized by a collection of random variables.  ...  The probability distributions for these random parameters are unknown.  ...  The works in [27, 25] consider the verification of MDPs with convex uncertainties.  ... 
doi:10.1007/978-3-030-45190-5_16 pmid:32754724 pmcid:PMC7402411 fatcat:n3goluiytjbedhpseq5vpqscfa

Scenario-Based Verification of Uncertain Parametric MDPs [article]

Thom S. Badings, Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
2021 arXiv   pre-print
The problem is to compute the probability to satisfy a temporal logic specification within any concrete MDP that corresponds to a sample from these distributions.  ...  The number of samples required to obtain high confidence on these bounds is independent of the number of states and the number of random parameters.  ...  Programming for the Efficient Verification of Parametric MDPs. In: TACAS (2), LNCS, vol. 10206, pp. 133–150. J. Artif. Intell. Res. 59, 229–264 (2017). DOI 10.1613/jair.5242.  ... 
arXiv:2112.13020v1 fatcat:g7karwsgpnaxrbtcj3lem5s5ke
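The scenario idea behind this entry can be sketched as follows (a toy model with made-up numbers, not the paper's algorithm): sample parameter values, check the specification on each instantiated chain, and, if no sample violates it, a standard binomial argument bounds the violation probability of a fresh sample — with the number of samples independent of the state count.

```python
import numpy as np

rng = np.random.default_rng(0)

def reach_prob(p):
    # closed-form reachability of a hypothetical toy parametric chain
    return p**2 / (1 - p + p**2)

N, beta = 1000, 1e-3            # samples; 1 - beta = confidence level
ps = rng.uniform(0.7, 0.9, N)   # draws from the (unknown) parameter distribution
violations = int(np.sum(reach_prob(ps) < 0.5))  # spec: P(reach) >= 0.5

if violations == 0:
    # If all N sampled MDPs satisfy the spec, then with confidence
    # 1 - beta a fresh sample violates it with probability at most eps,
    # chosen so that (1 - eps)^N <= beta.
    eps = 1 - beta ** (1 / N)
    print(f"violation probability <= {eps:.4f} with confidence {1 - beta}")
```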

Convex Optimization for Parameter Synthesis in MDPs [article]

Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
2021 arXiv   pre-print
The first approach exploits the so-called convex-concave procedure (CCP), and the second approach utilizes a sequential convex programming (SCP) method.  ...  Consequently, parametric MDPs (pMDPs) extend MDPs with transition probabilities that are functions over unspecified parameters.  ...  SEQUENTIAL CONVEX PROGRAMMING In this section, we discuss our second method, which is a sequential convex programming (SCP) approach with trust region constraints [23]–[25].  ... 
arXiv:2107.00108v1 fatcat:xmbhqovctzamjgojyt62xj6chy

Synthesis in pMDPs: A Tale of 1001 Parameters [article]

Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
2018 arXiv   pre-print
To deal with the NP-hardness of such problems, we exploit a convex-concave procedure (CCP) to iteratively obtain local optima.  ...  We show that this problem can be formulated as a quadratically-constrained quadratic program (QCQP) and is non-convex in general.  ...  The experiments showed that our method significantly improves the state-of-the-art. In the future, we will investigate how to automatically handle nonaffine transition functions.  ... 
arXiv:1803.02884v4 fatcat:5j34rtfjo5f4fnjfcdjjl6zshy
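The convex-concave procedure this entry relies on can be illustrated on a scalar difference-of-convex problem (an illustrative sketch, unrelated to the paper's QCQP encoding): write the nonconvex objective as f - g with f, g convex, and repeatedly minimize the convex surrogate obtained by linearizing g at the current iterate.

```python
def ccp(x0, iters=50):
    """Convex-concave procedure on min_x x**4 - x**2, viewed as the
    difference of convex functions f(x) = x**4 and g(x) = x**2.
    Each step linearizes g at x_k and solves the convex surrogate
    min_x x**4 - 2*x_k*x in closed form (stationarity: 4*x**3 = 2*x_k)."""
    x = x0
    for _ in range(iters):
        x = (abs(x) / 2) ** (1 / 3) * (1 if x >= 0 else -1)
    return x
```

Starting from x0 = 1, the iterates converge to the local optimum 1/sqrt(2) of the nonconvex objective; CCP guarantees such local optima, not global ones, which matches the paper's framing of iteratively obtained local solutions.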

Adversarial Robustness Verification and Attack Synthesis in Stochastic Systems [article]

Lisa Oakley, Alina Oprea, Stavros Tripakis
2021 arXiv   pre-print
We find that the parametric solution results in fast computation for small parameter spaces.  ...  Probabilistic model checking is a useful technique for specifying and verifying properties of stochastic systems including randomized protocols and the theoretical underpinnings of reinforcement learning  ...  Acknowledgment This work has been supported by the National Science Foundation under NSF SaTC awards CNS-1717634 and CNS-1801546.  ... 
arXiv:2110.02125v1 fatcat:46qsnbmt6bcbxehhhrhfgxtid4

Optimising Partial-Order Plans Via Action Reinstantiation

Max Waters, Lin Padgham, Sebastian Sardina
2020 Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence  
This work investigates the problem of optimising a partial-order plan's (POP) flexibility through the simultaneous transformation of its action ordering and variable binding constraints.  ...  While the former has been extensively studied through the notions of deordering and reordering, the latter has received much less attention.  ... 
doi:10.24963/ijcai.2020/569 dblp:conf/ijcai/Suilen0CT20 fatcat:tjh6yr57ovaw3d4ldiv4jjgzpi

Robust Policy Synthesis for Uncertain POMDPs via Convex Optimization [article]

Marnix Suilen, Nils Jansen, Murat Cubuktepe, Ufuk Topcu
2020 arXiv   pre-print
For uncertainty sets that form convex polytopes, we provide a transformation of the problem to a convex QCQP with finitely many constraints.  ...  The transition probability function of uPOMDPs is only known to belong to a so-called uncertainty set, for instance in the form of probability intervals.  ...  so-called parametric MDPs [Junges et al., 2019] .  ... 
arXiv:2001.08174v2 fatcat:jeig4nh7vvfrfbod6q5p55ey2a

Multi-objective Robust Strategy Synthesis for Interval Markov Decision Processes [chapter]

Ernst Moritz Hahn, Vahid Hashemi, Holger Hermanns, Morteza Lahijanian, Andrea Turrini
2017 Lecture Notes in Computer Science  
In this paper, we consider the problem of multi-objective robust strategy synthesis for interval MDPs, where the aim is to find a robust strategy that guarantees the satisfaction of multiple properties  ...  They provide a powerful modelling tool for probabilistic systems with an additional variation or uncertainty that prevents the knowledge of the exact transition probabilities.  ...  As regards strategy synthesis algorithms, the works in [17, 29] considered synthesis for parametric MDPs and MDPs with ellipsoidal uncertainty in the verification community.  ... 
doi:10.1007/978-3-319-66335-7_13 fatcat:z4vt2tcamrhg3cr3vcqm6thauy
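The robust backup at the core of interval-MDP synthesis can be sketched as follows (a minimal sketch with made-up bounds and values): the adversary's inner minimization over transition probabilities constrained to intervals admits a greedy solution.

```python
import numpy as np

def worst_case_value(lo, hi, v):
    """Inner step of robust value iteration for an interval MDP
    (illustrative sketch). Chooses a distribution p with
    lo <= p <= hi and sum(p) == 1 that minimizes p @ v: start every
    successor at its lower bound, then greedily assign the remaining
    probability mass to the lowest-value successors first."""
    lo, hi, v = (np.asarray(a, dtype=float) for a in (lo, hi, v))
    p = lo.copy()
    slack = 1.0 - p.sum()            # probability mass still to place
    for i in np.argsort(v):          # cheapest successors first
        add = min(hi[i] - p[i], slack)
        p[i] += add
        slack -= add
    return float(p @ v)

# successors worth 0, 1, 2, each with interval [0.1, 0.8]:
# the adversary pushes the free mass onto the value-0 successor
print(worst_case_value([0.1] * 3, [0.8] * 3, [0.0, 1.0, 2.0]))  # ≈ 0.3
```

The same greedy assignment (in reverse value order) yields the best-case bound, and alternating robust backups over all actions gives the robust strategy synthesis loop.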

Automated Verification and Synthesis of Stochastic Hybrid Systems: A Survey [article]

Abolfazl Lavaei, Sadegh Soudjani, Alessandro Abate, Majid Zamani
2022 arXiv   pre-print
Automated verification and policy synthesis for stochastic hybrid systems can be inherently challenging: this is due to the heterogeneity of their dynamics (presence of continuous and discrete components)  ...  In this survey, we overview the most recent results in the literature and discuss different approaches, including (in)finite abstractions, verification and synthesis for temporal logic specifications,  ...  We hope that this survey article provides an introduction to the foundations of SHS, towards an easier understanding of many challenges and existing solutions related to formal verification and control  ... 
arXiv:2101.07491v2 fatcat:dpir554ebfclhpj5m7e7fi2hv4

Transfer in inverse reinforcement learning for multiple strategies

Ajay Kumar Tanwani, Aude Billard
2013 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems  
Instead of learning from scratch every optimal policy in this set, the learner transfers knowledge from the set of learned policies to bootstrap its search for a new optimal policy.  ...  We consider the problem of incrementally learning different strategies of performing a complex sequential task from multiple demonstrations of an expert or a set of experts.  ...  If the verification fails at any of the above three steps, π_init is declared the optimal policy for w.  ... 
doi:10.1109/iros.2013.6696817 dblp:conf/iros/TanwaniB13 fatcat:irecdebdsfdxze2ifsp76xatbq

Robust control of uncertain Markov Decision Processes with temporal logic specifications

Eric M. Wolff, Ufuk Topcu, Richard M. Murray
2012 2012 IEEE 51st IEEE Conference on Decision and Control (CDC)  
A robust version of dynamic programming allows us to solve for an ε-suboptimal robust control policy with time complexity O(log(1/ε)) times that for the non-robust case.  ...  A robust control policy for the MDP is generated that maximizes the worst-case probability of satisfying the specification over all transition probabilities in the uncertainty set.  ...  APPENDIX Theorem 2: The proof closely follows that of [5]. We add additional quantification over all possible probability distributions in (15) and (16). First, partition the state space.  ... 
doi:10.1109/cdc.2012.6426174 dblp:conf/cdc/WolffTM12 fatcat:es3s4sunazekzi7cm2hdt3f7h4

Managing engineering systems with large state and action spaces through deep reinforcement learning [article]

C.P. Andriotis, K.G. Papakonstantinou
2018 arXiv   pre-print
Decision-making for engineering systems can be efficiently formulated as a Markov Decision Process (MDP) or a Partially Observable MDP (POMDP).  ...  Apart from deep function approximations that parametrize large state spaces, DCMAC also adopts a factorized representation of the system actions, being able to designate individualized component- and subsystem-level  ...  Acknowledgements This material is based upon work supported by the National Science Foundation under CAREER Grant No. 1751941.  ... 
arXiv:1811.02052v1 fatcat:foeooax7fvcrhbtxjpvcogzmqu

Task-Aware Verifiable RNN-Based Policies for Partially Observable Markov Decision Processes

Steven Carr, Nils Jansen, Ufuk Topcu
2021 The Journal of Artificial Intelligence Research  
Using such methods, if the Markov chain does not satisfy the specification, a byproduct of verification is diagnostic information about the states in the POMDP that are critical for the specification.  ...  Machine learning methods typically train recurrent neural networks (RNN) as effective representations of POMDP policies that can efficiently process sequential data.  ...  Acknowledgements Steven Carr and Ufuk Topcu were supported by the grants DARPA D19AP00004, ONR N00014-18-1-2829, and ARL ACC-APG-RTP W911NF. Nils Jansen was supported by the grant NWO OCENW.  ... 
doi:10.1613/jair.1.12963 fatcat:usbrnbs6dvarrbnj2x4bmmmrwa
Showing results 1 — 15 out of 172 results