Reinforcement Learning Algorithms: Survey and Classification

N. R. Ravishankar, M. V. Vijayakumar
2017 Indian Journal of Science and Technology  
Reinforcement Learning (RL) is well known to fall into two broad classes: model-based and model-free RL [3]. A model-based RL agent has knowledge of the environment in which it acts, and of itself as well. The state-transition model and the reward model are both available a priori. That is, the agent knows the state transition probabilities, P(s'|s, a). It also has the reward matrix ... The agent's job is to find an optimal policy from a given state to the goal state, that is, one that maximizes the expected total reward. The policy whose path has the best overall utility is considered the optimal policy. Some of the governing equations in model-based RL are:
Utility equation: $U = E\left[\sum_{t} \gamma^{t} R(S_t)\right]$
Bellman's equation: $U^{\pi}(s) = R(s) + \gamma \sum_{s'} P(s' \mid s, \pi(s))\, U^{\pi}(s')$
where $\gamma$ is the discount factor.
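As an illustration of how Bellman's equation can be used when the transition and reward models are known, the following minimal sketch evaluates a fixed policy by repeatedly applying the equation until the utilities stop changing. The three-state chain, the action names "stay" and "go", the probabilities, and the rewards are all hypothetical values chosen for this example; they do not come from the paper.

```python
# Iterative policy evaluation on a tiny, hypothetical 3-state MDP.
# All states, actions, probabilities and rewards are illustrative assumptions.

GAMMA = 0.9                      # discount factor (gamma)
STATES = [0, 1, 2]               # state 2 plays the role of the goal state
REWARD = [0.0, 0.0, 1.0]         # R(s): reward for being in state s

# P[s][a] maps each next state s' to the probability P(s' | s, a).
P = {
    0: {"stay": {0: 1.0}, "go": {1: 0.8, 0: 0.2}},
    1: {"stay": {1: 1.0}, "go": {2: 0.8, 1: 0.2}},
    2: {"stay": {2: 1.0}, "go": {2: 1.0}},   # goal state is absorbing
}


def evaluate_policy(policy, gamma=GAMMA, tol=1e-10, max_iters=10_000):
    """Return U^pi(s) for every state under a fixed policy.

    Applies U(s) <- R(s) + gamma * sum_{s'} P(s'|s, pi(s)) * U(s')
    until the largest change across states is below tol.
    """
    u = [0.0 for _ in STATES]
    for _ in range(max_iters):
        new_u = [
            REWARD[s]
            + gamma * sum(prob * u[s2] for s2, prob in P[s][policy[s]].items())
            for s in STATES
        ]
        if max(abs(a - b) for a, b in zip(new_u, u)) < tol:
            return new_u
        u = new_u
    return u


# Two candidate policies: always move toward the goal, or never move.
u_go = evaluate_policy({0: "go", 1: "go", 2: "stay"})
u_stay = evaluate_policy({0: "stay", 1: "stay", 2: "stay"})

print("U under 'go':  ", [round(v, 4) for v in u_go])
print("U under 'stay':", [round(v, 4) for v in u_stay])
```

Under these assumed numbers, the policy that moves toward the goal has higher utility in every non-goal state, which is the sense in which the policy with the best overall utility is taken to be optimal.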
doi:10.17485/ijst/2017/v10i1/109385 fatcat:n7fsqz5mxfbetlopeplp2h2pke