Intelligent Model Learning Based on Variance for Bayesian Reinforcement Learning
2015
2015 IEEE 27th International Conference on Tools with Artificial Intelligence (ICTAI)
We consider a modular approach to reinforcement learning that represents uncertainty in model parameters by maintaining probability distributions over them. The algorithm, which we call MBDP (model-based Bayesian dynamic programming), can be decomposed into two parallel types of inference: model learning and policy learning. During model learning, we update the posterior distribution over the model using the observations made after taking an action in each state. During policy learning, we solve MDPs by dynamic
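The model-learning step described above can be illustrated with a minimal sketch. It assumes a discrete state space and a Dirichlet posterior over next-state probabilities for each state-action pair. The abstract does not name the distribution family, so the Dirichlet choice, the class name `DirichletTransitionModel`, and its methods are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class DirichletTransitionModel:
    """Hypothetical sketch: one Dirichlet posterior per (state, action) pair.

    Assumption: transitions are categorical with a symmetric Dirichlet prior.
    This is an illustration, not the MBDP implementation.
    """

    def __init__(self, n_states, prior=1.0):
        self.n_states = n_states
        # Concentration parameters start at the prior pseudo-count.
        self.alpha = defaultdict(lambda: [prior] * n_states)

    def update(self, state, action, next_state):
        # Conjugate update: add one count for the observed transition.
        self.alpha[(state, action)][next_state] += 1.0

    def mean(self, state, action):
        # Posterior mean of each transition probability.
        alpha = self.alpha[(state, action)]
        total = sum(alpha)
        return [a / total for a in alpha]

    def variance(self, state, action):
        # Posterior variance of each transition probability:
        # a_i * (a_0 - a_i) / (a_0^2 * (a_0 + 1)), where a_0 = sum of a_i.
        alpha = self.alpha[(state, action)]
        total = sum(alpha)
        return [a * (total - a) / (total ** 2 * (total + 1)) for a in alpha]


model = DirichletTransitionModel(n_states=2)
prior_variance = model.variance(0, 0)
model.update(0, 0, 1)
model.update(0, 0, 1)
posterior_mean = model.mean(0, 0)
posterior_variance = model.variance(0, 0)
```

In this sketch the posterior variance shrinks as transitions from a state-action pair are observed, which is one way a variance signal could indicate how well a part of the model has been learned.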
doi:10.1109/ictai.2015.37
dblp:conf/ictai/YouLZWZ15
fatcat:d4k6tjqk3reetfbhg5pqy5gn2u