27,391 Hits in 9.3 sec

Improved Iterative Methods for Verifying Markov Decision Processes [chapter]

Jaber Karimpour, Ayaz Isazadeh, MohammadSadegh Mohagheghi, Khayyam Salehi
2015 Lecture Notes in Computer Science  
Value and policy iteration are powerful methods for verifying quantitative properties of Markov Decision Processes (MDPs). Many approaches have been proposed to accelerate these methods.  ...  Experimental results show that they do not work much better than standard value/policy iteration when the graph of the MDP is dense.  ...  Introduction Markov Decision Processes (MDPs) are transition systems that can be used for modeling both the nondeterministic and the stochastic behavior of reactive systems.  ... 
doi:10.1007/978-3-319-24644-4_14 fatcat:6r57nlda6ngwdolweipz4ok7pu
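The value iteration method that this entry (and several entries below) builds on can be sketched in a few lines. The two-state MDP, its rewards, and the discount factor below are invented purely for illustration; this is a minimal sketch of the classical algorithm, not the authors' implementation:

```python
# Value iteration on a toy two-state MDP (all numbers invented for illustration).
# transitions[s][a] = list of (probability, next_state, reward) outcomes.
transitions = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
gamma = 0.9  # discount factor

V = {s: 0.0 for s in transitions}
for _ in range(1000):  # iterate the Bellman operator to (approximate) convergence
    V_new = {}
    for s, actions in transitions.items():
        # Bellman optimality update: best expected one-step reward plus discounted value
        V_new[s] = max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
            for outcomes in actions.values()
        )
    if max(abs(V_new[s] - V[s]) for s in V) < 1e-9:
        V = V_new
        break
    V = V_new

print(round(V[1], 2))  # → 20.0 (fixed point of 2 + 0.9 * V[1])
```

Because the Bellman operator is a gamma-contraction, the iterates converge geometrically; the dense-graph slowdown discussed in the paper comes from the cost of each sweep, not the number of sweeps.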

Binary LDPC Codes Decoding Algorithm Based on MRF and FPGA Implementation

Zhongxun Wang, Wenqiang Wu
2016 TELKOMNIKA (Telecommunication Computing Electronics and Control)  
Whether a hard-decision or a soft-decision LDPC decoding algorithm is used, after optimizing the ring-detection algorithm all ring counts can be obtained in a single test, instead of testing each ring length separately.  ...  The improved LDPC decoding algorithm mainly aims at improving decoding performance or reducing decoding complexity.  ...  It can therefore reduce the computational complexity and improve detection speed. Parameter Estimation Method: for a parametric model, parameter estimation is very important.  ... 
doi:10.12928/telkomnika.v14i4.4158 fatcat:rysvrrtbkbfy3pmim4egwocb34

Successive approximations for Markov decision processes and Markov games with unbounded rewards

Jo Van Nunen, Jaap Wessels
1979 Mathematische Operationsforschung und Statistik Series Optimization  
The aim of this paper is to give an overview of recent developments in the area of successive approximations for Markov decision processes and Markov games.  ...  The general presentation is given for Markov decision processes, with a final section devoted to the possibilities of extension to Markov games.  ...  Introduction In recent years quite a lot of research effort has been dedicated to successive approximation methods in Markov decision processes and Markov games for the total expected reward as well as  ... 
doi:10.1080/02331937908842597 fatcat:e6yeqbx6ijeb3ebf2dtum72kpy

SEMI-MARKOV DECISION PROBLEMS AND PERFORMANCE SENSITIVITY ANALYSIS

Xi-Ren Cao
2002 IFAC Proceedings Volumes  
In particular, we show that performance sensitivity formulas and policy iteration algorithms of semi-Markov decision processes (SMDPs) can be derived based on performance potential and realization matrix  ...  First, we develop PA theory for semi-Markov processes (SMPs); and second, we extend the aforementioned results about the relation among PA, MDP, and RL to SMPs.  ...  With realization matrix and performance potential, sensitivity formulas and policy iteration algorithms of semi-Markov decision processes (SMDPs) can be derived easily in the same way as for Markov processes  ... 
doi:10.3182/20020721-6-es-1901.00511 fatcat:vhoqus356fatzhdv7lcnctjg2a

Semi-markov decision problems and performance sensitivity analysis

Xi-Ren Cao
2003 IEEE Transactions on Automatic Control  
In particular, we show that performance sensitivity formulas and policy iteration algorithms of semi-Markov decision processes (SMDPs) can be derived based on performance potential and realization matrix  ...  First, we develop PA theory for semi-Markov processes (SMPs); and second, we extend the aforementioned results about the relation among PA, MDP, and RL to SMPs.  ...  With realization matrix and performance potential, sensitivity formulas and policy iteration algorithms of semi-Markov decision processes (SMDPs) can be derived easily in the same way as for Markov processes  ... 
doi:10.1109/tac.2003.811252 fatcat:nxnsu4tcg5cgjpjmbznoxgazdq

CertRL: Formalizing Convergence Proofs for Value and Policy Iteration in Coq [article]

Koundinya Vajjha, Avraham Shinnar, Vasily Pestun, Barry Trager, Nathan Fulton
2020 arXiv   pre-print
This paper begins the work of closing this gap by developing a Coq formalization of two canonical reinforcement learning algorithms: value and policy iteration for finite state Markov decision processes  ...  The CertRL library provides a general framework for proving properties about Markov decision processes and reinforcement learning algorithms, paving the way for further work on the formalization of reinforcement learning.  ...  Markov Decision Processes We refer to [Put94] for a detailed presentation of the theory of Markov decision processes.  ... 
arXiv:2009.11403v2 fatcat:ab5oua6w5fdtllvkyfxrowneo4

Improving graph-based methods for computing qualitative properties of Markov decision processes

Mohammadsadegh Mohagheghi, Khayyam Salehi
2020 Indonesian Journal of Electrical Engineering and Computer Science  
Iterative numerical methods are used to compute the reachability probabilities for the remaining states.  ...  In this paper, we focus on the graph-based pre-computations and propose a heuristic to improve the performance of these pre-computations.  ...  (Markov Decision Process) A Markov Decision Process (MDP) is defined as a tuple whose components include a finite set S of states, an initial state s0, and a finite set A of actions.  ... 
doi:10.11591/ijeecs.v17.i3.pp1571-1577 fatcat:ugebc54wcvbo7gj2ypex7aydvi
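The graph-based pre-computations this entry refers to typically include the classical "Prob0" step: under maximized reachability, a state reaches the target with probability 0 exactly when the target is unreachable in the underlying graph, so no numerical iteration is needed for those states. A hedged sketch on an invented four-state MDP (the states, edges, and target set below are illustrative only):

```python
# Prob0 precomputation sketch: backward reachability over the MDP's underlying graph.
# States from which the target set is graph-unreachable have Pmax = 0.
from collections import deque

# successors[s] = set of states reachable in one step under some action (toy example)
successors = {0: {0, 1}, 1: {2}, 2: {2}, 3: {3}}
target = {2}

# invert the edge relation so we can walk backwards from the target
predecessors = {s: set() for s in successors}
for s, succs in successors.items():
    for t in succs:
        predecessors[t].add(s)

# backward BFS: collect every state that can reach the target at all
can_reach = set(target)
queue = deque(target)
while queue:
    t = queue.popleft()
    for s in predecessors[t]:
        if s not in can_reach:
            can_reach.add(s)
            queue.append(s)

prob0 = set(successors) - can_reach
print(sorted(prob0))  # → [3]: state 3 only self-loops, so Pmax = 0 there
```

Running this step first shrinks the state space that the iterative numerical methods mentioned above must actually process.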

An Analysis of Primal-Dual Algorithms for Discounted Markov Decision Processes [article]

Randy Cogill
2016 arXiv   pre-print
solving discounted-cost Markov decision processes.  ...  We will first show that several widely-used algorithms for Markov decision processes can be interpreted in terms of the primal-dual method, where the value function is updated with suboptimal solutions  ... 
arXiv:1601.04175v1 fatcat:hxyxogbaobe3xl2uy3sb7q6tlq

Ideological and Political Education Recommendation System Based on AHP and Improved Collaborative Filtering Algorithm

Nan Wang, Le Sun
2021 Scientific Programming  
Firstly, considering the time effect of student scoring, the recommendation model is transformed into a Markov decision process.  ...  (AHP) and an improved collaborative filtering algorithm.  ...  To optimize the SVD++ recommendation model with a reinforcement learning method, the mapping between the recommendation prediction model and a Markov decision process must be established first.  ... 
doi:10.1155/2021/2648352 fatcat:6r7l6lxxnjfavacrl2h4ixmgca

Page 770 of Mathematical Reviews Vol. , Issue 81B [page]

1981 Mathematical Reviews  
E. van Nunen (Delft) Doshi, Bharat 81b:90149 Policy improvement algorithm for continuous time Markov decision processes with switching costs.  ...  E. van Nunen (Delft) Yasuda, Masami 81b:90152 Policy improvement in Markov decision processes and Markov potential theory. Bull. Math. Statist. 18 (1978/79), no. 1-2, 55-67.  ... 

Thematic issue on "bio-inspired learning for data analysis"

Yaochu Jin, Jinliang Ding, Yongsheng Ding
2017 Memetic Computing  
The second paper, "Thermal Image Colorization Using Markov Decision Processes" by Gu et al., employs Markov Decision Processes (MDPs) to deal with the computational complexity of the colorization problem  ...  This issue aims to present a collection of recent advances in learning and optimization using bio-inspired techniques and other learning methods.  ... 
doi:10.1007/s12293-017-0223-8 fatcat:bxhcjjneevfodj3snnsy7fz3na

Reversible Markov Decision Processes with an Average-Reward Criterion

Randy Cogill, Cheng Peng
2013 SIAM Journal on Control and Optimization  
In this paper we study the structure of optimal control policies for Markov decision processes with reversible dynamics.  ...  For optimal control of Markov decision processes, significant analytical and computational simplifications can arise as a result of reversibility.  ...  We say that a Markov decision process is a reversible Markov decision process if under any state-feedback policy, the resulting Markov chain is reversible.  ... 
doi:10.1137/110844957 fatcat:v67ii46zsnedrjfbtxefxwjdbi

Drift and monotonicity conditions for continuous-time controlled markov chains with an average criterion

Xianping Guo, O. Hernandez-Lerma
2003 IEEE Transactions on Automatic Control  
, and under which the convergence of a policy iteration method is also shown.  ...  Index Terms-Average (or ergodic) reward/cost criterion, continuous-time controlled Markov chains (or continuous-time Markov decision processes), drift and monotonicity conditions, optimal stationary policy  ...  Let an arbitrary "initial" stationary policy be given, and consider the sequence of stationary policies obtained by the above policy iteration method. The improvement at each iteration is given by (5.9).  ... 
doi:10.1109/tac.2002.808469 fatcat:3evjxwmbdzgg3dv6msoifka2a4

A Modified Policy Iteration Algorithm for Discounted Reward Markov Decision Processes

Sanaa Chafik, Cherki Daoui
2016 International Journal of Computer Applications  
This paper presents a Modified Policy Iteration algorithm to compute an optimal policy for large Markov decision processes under the discounted reward criterion and an infinite horizon.  ...  The running time of the classical algorithms for Markov Decision Processes (MDPs) typically grows linearly with the state space size, which makes them frequently intractable.  ...  MARKOV DECISION PROCESSES 2.1 Markov chains and stochastic processes A stochastic process is simply a collection of random variables S_t indexed by time t.  ... 
doi:10.5120/ijca2016908033 fatcat:iufsvfkx4jb5vlae4hwvq5oroq
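The classical policy iteration that the modified algorithm above builds on alternates exact policy evaluation with greedy improvement. A minimal sketch on an invented two-state MDP (this performs the evaluation step exactly by fixed-point iteration; the modified variant in the paper truncates it, which this sketch does not attempt to reproduce):

```python
# Classical policy iteration on a toy two-state MDP (all numbers invented).
# transitions[s][a] = list of (probability, next_state, reward) outcomes.
transitions = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
gamma = 0.9

policy = {s: 0 for s in transitions}  # start from an arbitrary policy
while True:
    # policy evaluation: iterate V <- T^pi(V) until it is numerically fixed
    V = {s: 0.0 for s in transitions}
    for _ in range(2000):
        V = {s: sum(p * (r + gamma * V[s2])
                    for p, s2, r in transitions[s][policy[s]])
             for s in transitions}
    # policy improvement: act greedily with respect to V^pi
    new_policy = {
        s: max(transitions[s],
               key=lambda a: sum(p * (r + gamma * V[s2])
                                 for p, s2, r in transitions[s][a]))
        for s in transitions
    }
    if new_policy == policy:  # greedy policy unchanged: it is optimal
        break
    policy = new_policy

print(policy)  # → {0: 1, 1: 1}
```

On a finite MDP this loop terminates in finitely many improvement steps, since each step yields a strictly better policy until the optimum is reached; the per-iteration cost is dominated by the evaluation step, which is exactly what modified policy iteration truncates.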

Dealing with uncertainty in verification of nondeterministic systems

Yamilet R. Serrano Llerena
2014 Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014  
To address this problem, the goal of this research is to provide a method based on perturbation analysis for probabilistic model checking of nondeterministic systems which are modelled as Markov Decision Processes.  ...  Figure 1: Machine Replacement Problem modelled as a Markov Decision Process  ... 
doi:10.1145/2635868.2666598 dblp:conf/sigsoft/Llerena14 fatcat:olzqfqz4ijf2xe7rzjcephg4su
Showing results 1 — 15 out of 27,391 results