An Extension of Finite-state Markov Decision Process and an Application of Grammatical Inference [chapter]

Takeshi Shibata, Ryo Yoshinaka
2008 Reinforcement Learning  
positive data, the one computing a most general grammar and the modified update equations of some usual reinforcement learning methods.

Notation and definitions

Before we give the definition of simple context-free MDPs, we fix some standard notation and definitions and introduce subclasses of simple grammars and probabilistic grammars. A context-free grammar (CFG) is a quadruple G = ⟨V, Σ, R, S⟩, where V is a finite set of nonterminal symbols, Σ is a finite set of terminal symbols, R ⊂ V×(V ∪ Σ)* is a finite set of production rules, and S ∈ V is the start symbol. Let G = ⟨V, Σ, R, S⟩ be a CFG. We write XAZ ⇒_G XYZ if there is a rule A→Y in R. When G is clear from context, we write simply ⇒ instead of ⇒_G.
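The definition above can be illustrated with a minimal Python sketch. It is not taken from the chapter: the class name `CFG`, the function `derive_one_step`, and the toy grammar are all hypothetical choices made for illustration. The sketch stores a grammar as the quadruple (V, Σ, R, S) and computes the one-step derivation relation ⇒ on sentential forms represented as tuples of symbols.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """A context-free grammar as a quadruple (V, Sigma, R, S). Hypothetical helper."""
    nonterminals: frozenset  # V
    terminals: frozenset     # Sigma
    rules: frozenset         # R: pairs (A, Y), A in V, Y a tuple over V ∪ Sigma
    start: str               # S

    def __post_init__(self):
        # Check the side conditions stated in the definition.
        symbols = self.nonterminals | self.terminals
        assert self.start in self.nonterminals
        for lhs, rhs in self.rules:
            assert lhs in self.nonterminals
            assert all(sym in symbols for sym in rhs)


def derive_one_step(grammar, form):
    """Return all forms XYZ such that form = XAZ and A -> Y is a rule."""
    results = set()
    for i, symbol in enumerate(form):
        for lhs, rhs in grammar.rules:
            if lhs == symbol:
                results.add(form[:i] + rhs + form[i + 1:])
    return results


# Toy grammar (an assumption for illustration): S -> a S B | c, B -> b
toy = CFG(
    nonterminals=frozenset({"S", "B"}),
    terminals=frozenset({"a", "b", "c"}),
    rules=frozenset({
        ("S", ("a", "S", "B")),
        ("S", ("c",)),
        ("B", ("b",)),
    }),
    start="S",
)

print(derive_one_step(toy, ("S",)))
```

Calling `derive_one_step(toy, ("a", "S", "B"))` rewrites either the `S` or the `B`, so it returns three sentential forms, one for each applicable rule occurrence.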
doi:10.5772/5276 fatcat:2rtphxd6sne2jftvmxqc3zf3qi