Strategy Complexity of Mean Payoff, Total Payoff and Point Payoff Objectives in Countable MDPs [article]

Richard Mayr, Eric Munday
2021 arXiv   pre-print
Point payoff (the sequence of directly seen transition rewards), 2. Total payoff (the sequence of the sums of all rewards so far), and 3. Mean payoff.  ...  We establish the complete picture of the strategy complexity of these objectives, i.e., how much memory is necessary and sufficient for ε-optimal (resp. optimal) strategies.  ...  Upper bounds We establish upper bounds on the strategy complexity of lim inf threshold objectives for mean payoff, total payoff and point payoff.  ... 
arXiv:2107.03287v2 fatcat:pthifguwufgrnkmima2bealqsy

Strategy Complexity of Point Payoff, Mean Payoff and Total Payoff Objectives in Countable MDPs [article]

Richard Mayr, Eric Munday
2022 arXiv   pre-print
Point payoff (the sequence of directly seen transition rewards), 2. Mean payoff (the sequence of the sums of all rewards so far, divided by the number of steps), and 3.  ...  We establish the complete picture of the strategy complexity of these objectives, i.e., how much memory is necessary and sufficient for ε-optimal (resp. optimal) strategies.  ...  Conclusion and Outlook We have established matching lower and upper bounds on the strategy complexity of lim inf threshold objectives for point, total and mean payoff on countably infinite MDPs; cf.  ... 
arXiv:2203.07079v1 fatcat:xbqxrrcjh5hxxhotnodltaxabq

Life is Random, Time is Not: Markov Decision Processes with Window Objectives [article]

Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour
2020 arXiv   pre-print
We develop a generic approach for window-based objectives and instantiate it for the classical mean-payoff and parity objectives, already considered in games.  ...  correctness of behaviors in the limit.  ...  We prove these complexities to be almost tight (Thm. 4.8), the most interesting case being the PSPACE-hardness of DFW mean-payoff objectives, even in the case of acyclic MDPs.  ... 
arXiv:1901.03571v5 fatcat:wlzlix64ancuxbz2m2yvzt6hqu

Life Is Random, Time Is Not: Markov Decision Processes with Window Objectives

Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour, Michael Wagner
2019 International Conference on Concurrency Theory  
We develop a generic approach for window-based objectives and instantiate it for the classical mean-payoff and parity objectives, already considered in games.  ...  processes, window mean-payoff, window parity  ...  We prove these complexities to be almost tight (Thm. 5), the most interesting case being the PSPACE-hardness of DFW mean-payoff objectives, even in the case of acyclic MDPs.  ... 
doi:10.4230/lipics.concur.2019.8 dblp:conf/concur/BrihayeDOR19 fatcat:v4qxzfut3rfuhg3tdpkqr3ctpm

Life is Random, Time is Not: Markov Decision Processes with Window Objectives

Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour
2019 Logical Methods in Computer Science  
We develop a generic approach for window-based objectives and instantiate it for the classical mean-payoff and parity objectives, already considered in games.  ...  correctness of behaviors in the limit.  ...  We prove these complexities to be almost tight (Thm. 4.8), the most interesting case being the PSPACE-hardness of DFW mean-payoff objectives, even in the case of acyclic MDPs.  ... 
doi:10.23638/lmcs-16(4:13)2020 fatcat:ty7g5sr2tver5gipe2nzqmpgxi

Conditional Value-at-Risk for Reachability and Mean Payoff in Markov Decision Processes [article]

Jan Křetínský, Tobias Meggendorfer
2018 arXiv   pre-print
We present the conditional value-at-risk (CVaR) in the context of Markov chains and Markov decision processes with reachability and mean-payoff objectives.  ...  We derive lower and upper bounds on the computational complexity of the respective decision problems and characterize the structure of the strategies in terms of memory and randomization.  ...  We thank Vojtěch Forejt for bringing up the topic of CVaR and the initial discussions with Jan Krčál and wish them both happy life in industry.  ... 
arXiv:1805.02946v1 fatcat:5p4vkm7hgzgw3cpsrnt25elfzu

Measuring and Synthesizing Systems in Probabilistic Environments [article]

Krishnendu Chatterjee, Thomas A. Henzinger, Barbara Jobstmann, Rohit Singh
2011 arXiv   pre-print
For general omega-regular specifications, the solution rests on a new, polynomial-time algorithm for computing optimal strategies in MDPs with mean-payoff parity objectives.  ...  For safety specifications and measures given by mean-payoff automata, the optimal-synthesis problem amounts to finding a strategy in a Markov decision process (MDP) that is optimal for a long-run average  ...  In contrast to MDPs with mean-payoff objectives, where pure memoryless optimal strategies exist, optimal strategies for mean-payoff parity objectives in MDPs require infinite memory.  ... 
arXiv:1004.0739v2 fatcat:cqcx5xhqaza5jpnbk7rnh53oxu

Trading performance for stability in Markov decision processes

Tomáš Brázdil, Krishnendu Chatterjee, Vojtěch Forejt, Antonín Kučera
2017 Journal of computer and system sciences (Print)  
We study controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize the expected mean-payoff performance and stability (also known as variability in the  ...  We show that a strategy ensuring both the expected mean payoff and the variance below given bounds requires randomization and memory, under both the above definitions.  ...  In the formal verification area, MDPs with multiple mean-payoff objectives [2], discounted objectives [9], cumulative reward objectives [17], and multiple ω-regular objectives [13] have been studied  ... 
doi:10.1016/j.jcss.2016.09.009 fatcat:gdoidcod7fbmpclzw3nyrqacmy

Trading Performance for Stability in Markov Decision Processes

Tomáš Brázdil, Krishnendu Chatterjee, Vojtěch Forejt, Antonín Kučera
2013 2013 28th Annual ACM/IEEE Symposium on Logic in Computer Science  
We study controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize the expected mean-payoff performance and stability (also known as variability in the  ...  We show that a strategy ensuring both the expected mean payoff and the variance below given bounds requires randomization and memory, under both the above definitions.  ...  In the formal verification area, MDPs with multiple mean-payoff objectives [2], discounted objectives [9], cumulative reward objectives [17], and multiple ω-regular objectives [13] have been studied  ... 
doi:10.1109/lics.2013.39 dblp:conf/lics/BrazdilCFK13 fatcat:vup5je3tpfco5cebezezclsmpi

Learning-Based Mean-Payoff Optimization in an Unknown MDP under Omega-Regular Constraints [article]

Jan Křetínský, Guillermo A. Pérez, Jean-François Raskin
2018 arXiv   pre-print
We formalize the problem of maximizing the mean-payoff value with high probability while satisfying a parity objective in a Markov decision process (MDP) with unknown probabilistic transition function  ...  of guarantees on the parity and mean-payoff objectives can be achieved depending on how much memory one is willing to use.  ...  Following this approach, in [1], Almagor et al. study MDPs equipped with a mean-payoff and parity objective.  ... 
arXiv:1804.08924v4 fatcat:wdz4hyx7obhxdcep2bbxd2dx6a

Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes

Tomáš Brázdil, Václav Brožek, Krishnendu Chatterjee, Vojtěch Forejt, Antonín Kučera
2011 2011 IEEE 26th Annual Symposium on Logic in Computer Science  
Our results also reveal flaws in previous work for MDPs with multiple mean-payoff functions under the expectation objective, correct the flaws and obtain improved results.  ...  in the size of the MDP and 1/ε, and exponential in the number of reward functions, for all ε > 0.  ...  Under the expectation objective with mean-payoff function, neither is there any immediate notion of "product" of MDP and mean-payoff function and nor do memoryless strategies suffice.  ... 
doi:10.1109/lics.2011.10 dblp:conf/lics/BrazdilBCFK11 fatcat:n44uqwtekjbhji7zrcsfudmhtu

Robust Equilibria in Concurrent Games [article]

Romain Brenguier
2016 arXiv   pre-print
We study the problem of finding robust equilibria in multiplayer concurrent games with mean payoff objectives.  ...  Robust equilibria in mean-payoff games reduce to winning strategies in multidimensional mean-payoff games for some threshold satisfying some constraints.  ...  The aim of this article is to characterise robust equilibria in order to construct the corresponding strategies, and precisely describe the complexity of the following decision problem for mean-payoff  ... 
arXiv:1311.7683v7 fatcat:spekr7yzy5bfhnu2bbjpxs7chu

Markov Decision Processes with Multiple Long-Run Average Objectives [chapter]

Krishnendu Chatterjee
2007 Lecture Notes in Computer Science  
We study Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) functions. We consider two different objectives, namely, expectation and satisfaction objectives.  ...  Our analysis also reveals flaws in previous work for MDPs with multiple mean-payoff functions under the expectation objective, corrects the flaws, and allows us to obtain improved results. 2012 ACM CCS  ...  Forejt is supported by a Royal Society Newton Fellowship and EPSRC project EP/J012564/1.  ... 
doi:10.1007/978-3-540-77050-3_39 fatcat:w4gwffqaqfez3j6dllwqc4q5kq

Markov Decision Processes with Multiple Long-run Average Objectives

Tomáš Brázdil, Václav Brožek, Krishnendu Chatterjee, Vojtěch Forejt, Antonín Kučera, Stephan Kreutzer
2014 Logical Methods in Computer Science  
We study Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) functions. We consider two different objectives, namely, expectation and satisfaction objectives.  ...  Our analysis also reveals flaws in previous work for MDPs with multiple mean-payoff functions under the expectation objective, corrects the flaws, and allows us to obtain improved results.  ...  Forejt is supported by a Royal Society Newton Fellowship and EPSRC project EP/J012564/1.  ... 
doi:10.2168/lmcs-10(1:13)2014 fatcat:ez7cpqd6abhxxjavehb6aonnxm

Trading Performance for Stability in Markov Decision Processes [article]

Tomáš Brázdil, Krishnendu Chatterjee, Vojtěch Forejt, Antonín Kučera
2013 arXiv   pre-print
We study the complexity of central controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize both the expected mean-payoff performance of the system and  ...  We show that a strategy ensuring both the expected mean-payoff and the variance below given bounds requires randomization and memory, under all the above semantics of variance.  ...  executed in a run ω and mp(ω) is the mean-payoff of ω.  ... 
arXiv:1305.4103v1 fatcat:v4enu4vn6fe2hgjludf5k4ahqm