260 Hits in 5.3 sec

Good-for-MDPs Automata for Probabilistic Analysis and Reinforcement Learning [article]

Ernst Moritz Hahn, Mateo Perez, Fabio Somenzi, Ashutosh Trivedi, Sven Schewe, Dominik Wojtczak
2019-10-30 · arXiv · pre-print
We characterize the class of nondeterministic ω-automata that can be used for the analysis of finite Markov decision processes (MDPs). We call these automata 'good-for-MDPs' (GFM). … show that going beyond limit-deterministic automata may significantly benefit reinforcement learning. … Evaluation: General Büchi Automata for Probabilistic Model Checking. As discussed, automata that simulate slim automata or SLDBAs are good for MDPs. …
arXiv:1909.05081v2 · fatcat:bto4m7ybe5gc7btkhsbdmaew64
[PDF] https://web.archive.org/web/20200830053451/https://arxiv.org/pdf/1909.05081v2.pdf

Good-for-MDPs Automata for Probabilistic Analysis and Reinforcement Learning [chapter]

Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak
2020 · Lecture Notes in Computer Science (Springer International Publishing)
We characterize the class of nondeterministic ω-automata that can be used for the analysis of finite Markov decision processes (MDPs). We call these automata 'good-for-MDPs' (GFM). … that going beyond limit-deterministic automata may significantly benefit reinforcement learning. … GFM Automata and Reinforcement Learning: SLDBAs have been used in [12] for model-free reinforcement learning of ω-regular objectives. …
doi:10.1007/978-3-030-45190-5_17 · fatcat:kicbhvwrwvfvfpjnwyxhkzf4ui
[PDF] https://web.archive.org/web/20200510014548/https://link.springer.com/content/pdf/10.1007%2F978-3-030-45190-5_17.pdf

The 10,000 Facets of MDP Model Checking [chapter]

Christel Baier, Holger Hermanns, Joost-Pieter Katoen
2019 · Lecture Notes in Computer Science (Springer International Publishing)
This paper presents a retrospective view on probabilistic model checking. We focus on Markov decision processes (MDPs, for short). … We survey the basic ingredients of MDP model checking and discuss its enormous developments since the seminal works by Courcoubetis and Yannakakis in the early 1990s. … Learning has also been applied to continuous-time MDPs (using gradient ascent) [22]. The use of automata learning techniques for probabilistic models [124] is also an interesting future direction. …
doi:10.1007/978-3-319-91908-9_21 · fatcat:yjsuwb5ibjff3cq3niatu6sbxq
[PDF] https://web.archive.org/web/20200325180215/https://link.springer.com/content/pdf/10.1007%2F978-3-319-91908-9_21.pdf

Omega-Regular Objectives in Model-Free Reinforcement Learning [chapter]

Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak
2019 · Lecture Notes in Computer Science (Springer International Publishing)
We provide the first solution for model-free reinforcement learning of ω-regular objectives for Markov decision processes (MDPs). … Our approach allows us to apply model-free, off-the-shelf reinforcement learning algorithms to compute optimal strategies from the observations of the MDP. … to be suitable for both qualitative and quantitative analysis of MDPs under all ω-regular objectives. …
doi:10.1007/978-3-030-17462-0_27 · fatcat:rfxrmhlb3ne7tbfbcch64ucz34
[PDF] https://web.archive.org/web/20190505082628/https://link.springer.com/content/pdf/10.1007%2F978-3-030-17462-0_27.pdf

Omega-Regular Objectives in Model-Free Reinforcement Learning [article]

Ernst Moritz Hahn and Mateo Perez and Sven Schewe and Fabio Somenzi and Ashutosh Trivedi and Dominik Wojtczak
2018-09-26 · arXiv · pre-print
We provide the first solution for model-free reinforcement learning of ω-regular objectives for Markov decision processes (MDPs). … Our approach allows us to apply model-free, off-the-shelf reinforcement learning algorithms to compute optimal strategies from the observations of the MDP. … suitable for both qualitative and quantitative analysis of MDPs under all ω-regular objectives. …
arXiv:1810.00950v1 · fatcat:ezm3djqsbfdf5iypljzfbxug4i
[PDF] https://web.archive.org/web/20200913063901/https://arxiv.org/pdf/1810.00950v1.pdf

Deep Statistical Model Checking [chapter]

Timo P. Gros, Holger Hermanns, Jörg Hoffmann, Michaela Klauck, Marcel Steinmetz
2020 · Lecture Notes in Computer Science (Springer International Publishing)
… "how good is the NN compared to the optimal policy?" (obtained by model checking the MDP), or "does further training improve the NN?" … Neither is the verification technology available, nor is it even understood what a formal, meaningful, extensible, and scalable testbed might look like for such a technology. … This work was partially supported by ERC Advanced Investigators Grant 695614 (POWVER), and by DFG Grant 389792660 as part of TRR 248 (CPEC). The authors thank Felix Freiberger for technical support. …
doi:10.1007/978-3-030-50086-3_6 · fatcat:hqnjedbyendnbkmogkituzdjim
[PDF] https://web.archive.org/web/20200709165049/https://link.springer.com/content/pdf/10.1007%2F978-3-030-50086-3_6.pdf

Formal Controller Synthesis for Continuous-Space MDPs via Model-Free Reinforcement Learning [article]

Abolfazl Lavaei, Fabio Somenzi, Sadegh Soudjani, Ashutosh Trivedi, and Majid Zamani
2020-03-02 · arXiv · pre-print
A novel reinforcement learning scheme to synthesize policies for continuous-space Markov decision processes (MDPs) is proposed. … This scheme enables one to apply model-free, off-the-shelf reinforcement learning algorithms for finite MDPs to compute optimal strategies for the corresponding continuous-space MDPs without explicitly … results for reinforcement learning on finite-state MDPs. …
arXiv:2003.00712v1 · fatcat:7chk4gejqbcshfvzgxgbylamzm
[PDF] https://web.archive.org/web/20200322204647/https://arxiv.org/pdf/2003.00712v1.pdf

Learning-Based Mean-Payoff Optimization in an Unknown MDP under Omega-Regular Constraints

Jan Kretínský, Guillermo A. Pérez, Jean-François Raskin, Michael Wagner
2018-08-13 · International Conference on Concurrency Theory
(i) For all ε and γ we can construct an online-learning finite-memory strategy that almost-surely satisfies the parity objective and which achieves an ε-optimal mean payoff with probability at least 1 … (ii) Alternatively, for all ε and γ there exists an online-learning infinite-memory strategy that satisfies the parity objective surely and which achieves an ε-optimal mean payoff with probability at least … Reinforcement-learning (RL, for short) algorithms for partially-specified Markov decision processes (MDPs) have been proposed (see e.g. [32, 22, 26, 28]) to learn strategies that reach (near-)optimal …
doi:10.4230/lipics.concur.2018.8 · dblp:conf/concur/KretinskyPR18 · fatcat:j6fszayujnd4rku6wa4o4puux4
[PDF] https://web.archive.org/web/20220120123649/https://drops.dagstuhl.de/opus/volltexte/2018/9546/pdf/LIPIcs-CONCUR-2018-8.pdf

Maximum Causal Entropy Specification Inference from Demonstrations [chapter]

Marcell Vazquez-Chanlatte, Sanjit A. Seshia
2020 · Lecture Notes in Computer Science (Springer International Publishing)
However, most methods for learning from demonstrations either do not provide guarantees that the learned artifacts can be safely composed or do not explicitly capture temporal properties. … This work continues this line of research by adapting maximum causal entropy inverse reinforcement learning to estimate the posterior probability of a specification given a multi-set of demonstrations … We would like to thank the anonymous referees as well as Daniel Fremont, Ben Caulfield, Marissa Ramirez de Chanlatte, Gil Lederman, Dexter Scobee, and Hazem Torfah for their useful suggestions and feedback …
doi:10.1007/978-3-030-53291-8_15 · fatcat:cvghjcxtrjh7pplm6fkv4ffrem
[PDF] https://web.archive.org/web/20200716053004/https://link.springer.com/content/pdf/10.1007%2F978-3-030-53291-8_15.pdf

Maximum Causal Entropy Specification Inference from Demonstrations [article]

Marcell Vazquez-Chanlatte, Sanjit A. Seshia
2020-05-16 · arXiv · pre-print
…, robotics) demonstrations provide a natural way to specify tasks; however, most methods for learning from demonstrations either do not provide guarantees that the artifacts learned for the tasks, such … This work continues this line of research by adapting maximum causal entropy inverse reinforcement learning to estimate the posterior probability of a specification given a multi-set of demonstrations … Acknowledgments: We would like to thank the anonymous referees as well as Daniel Fremont, Ben Caulfield, Marissa Ramirez de Chanlatte, Gil Lederman, Dexter Scobee, and Hazem Torfah for their useful suggestions …
arXiv:1907.11792v5 · fatcat:ecojp4yjqrg45avypy3bg5mr2m
[PDF] https://web.archive.org/web/20200520003600/https://arxiv.org/pdf/1907.11792v5.pdf

Model Checking Linear-Time Properties of Probabilistic Systems [chapter]

Christel Baier, Marcus Größer, Frank Ciesinski
2009 · Monographs in Theoretical Computer Science (Springer Berlin Heidelberg)
… control theory, reinforcement learning, economics, manufacturing, and semantics of randomized protocols. … This renders the state space explosion problem even more serious than in the non-probabilistic case, and the feasibility of algorithms for the quantitative analysis crucially depends on good heuristics …
doi:10.1007/978-3-642-01492-5_13 · fatcat:n3qjkl2ojreypn7sxc4e7zfvs4
[PDF] https://web.archive.org/web/20170810170415/http://www.dcc.fc.up.pt/~nam/web/resources/vfs12/exame/Baier.pdf

A Learning Automata based Solution for Optimizing Dialogue Strategy in Spoken Dialogue System

G. Kumaravelan, R. Sivakumar
2012-11-15 · International Journal of Computer Applications (Foundation of Computer Science)
Compared to other baseline reinforcement learning methods, the proposed approach exhibits better performance with regard to learning speed, good exploration/exploitation in its updates, and robustness … In spoken dialogue systems, Markov decision processes (MDPs) provide a formal framework for making dialogue-management decisions for planning. … Learning Automata (LA) are adaptive decision-making devices operating in an unknown random environment; each is associated with a finite set of actions, and each action has a certain probability …
doi:10.5120/9310-3541 · fatcat:ube5fvqiufbkdp7lmx2hi7nmbq
[PDF] https://web.archive.org/web/20170814231558/http://research.ijcaonline.org/volume58/number9/pxc3883541.pdf

A storm is Coming: A Modern Probabilistic Model Checker [article]

Christian Dehnert and Sebastian Junges and Joost-Pieter Katoen and Matthias Volk
2017-02-14 · arXiv · pre-print
We launch the new probabilistic model checker storm. It features the analysis of discrete- and continuous-time variants of both Markov chains and MDPs. … It supports the PRISM and JANI modeling languages, probabilistic programs, dynamic fault trees and generalized stochastic Petri nets. … The authors would like to thank people that support(ed) the development of Storm over the years (in alphabetical order): Philipp Berger, Harold Bruintjes, Gereon Kremer, David Korzeniewski, and Tim Quatmann …
arXiv:1702.04311v1 · fatcat:5oazkten7zg4zghea7wqf3s27q
[PDF] https://web.archive.org/web/20200911181749/https://arxiv.org/pdf/1702.04311v1.pdf

Markov Abstractions for PAC Reinforcement Learning in Non-Markov Decision Processes [article]

Alessandro Ronca, Gabriel Paludo Licks, Giuseppe De Giacomo
2022-05-18 · arXiv · pre-print
We show that Markov abstractions can be learned during reinforcement learning. Our approach combines automata learning and classic reinforcement learning. … Our work aims at developing reinforcement learning algorithms that do not rely on the Markov assumption. … Given a decision process P and a required accuracy ε > 0, Episodic Reinforcement Learning (RL) for P and ε is the problem of an agent that has to learn an optimal policy for P from the data it collects by interacting …
arXiv:2205.01053v2 · fatcat:fkok2co36zcbzlbicob5xpe2te
[PDF] https://web.archive.org/web/20220525114350/https://arxiv.org/pdf/2205.01053v2.pdf

Learning-Based Mean-Payoff Optimization in an Unknown MDP under Omega-Regular Constraints [article]

Jan Křetínský, Guillermo A. Pérez, Jean-François Raskin
2018-08-23 · arXiv · pre-print
(i) For all ϵ and γ we can construct an online-learning finite-memory strategy that almost-surely satisfies the parity objective and which achieves an ϵ-optimal mean payoff with probability at least 1 … (ii) Alternatively, for all ϵ and γ there exists an online-learning infinite-memory strategy that satisfies the parity objective surely and which achieves an ϵ-optimal mean payoff with probability at least … Safety-constrained reinforcement learning for MDPs. …
arXiv:1804.08924v4 · fatcat:wdz4hyx7obhxdcep2bbxd2dx6a
[PDF] https://web.archive.org/web/20200828030616/https://arxiv.org/pdf/1804.08924v4.pdf
Showing results 1–15 out of 260 results