File type: application/pdf
Convergence of Reinforcement Learning with General Function Approximators
1999
International Joint Conference on Artificial Intelligence
A key open problem in reinforcement learning is to assure convergence when using a compact hypothesis class to approximate the value function. Although the standard temporal-difference learning algorithm has been shown to converge when the hypothesis class is a linear combination of fixed basis functions, it may diverge with a general (nonlinear) hypothesis class. This paper describes the Bridge algorithm, a new method for reinforcement learning, and shows that it converges to an approximate …
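For context on the setting the abstract references: the known convergence result holds for TD learning when the value function is a linear combination of fixed basis functions. Below is a minimal sketch of TD(0) with such a linear approximator; it is illustrative background only, not the paper's Bridge algorithm, and the MDP dynamics, features, and step sizes are hypothetical stand-ins.

```python
import numpy as np

# TD(0) with a linear function approximator: V(s) ~= phi[s] @ theta.
# All quantities below (n_states, n_features, dynamics, rewards) are
# made up for illustration; only the update rule is the point.

n_states, n_features = 5, 3
rng = np.random.default_rng(0)

phi = rng.standard_normal((n_states, n_features))  # fixed basis functions
theta = np.zeros(n_features)                       # weight vector to learn

gamma, alpha = 0.9, 0.05                           # discount, step size
state = 0
for step in range(10_000):
    next_state = int(rng.integers(n_states))       # stand-in for environment dynamics
    reward = float(next_state == n_states - 1)     # stand-in reward signal

    # Temporal-difference error, then a gradient-style update on theta.
    td_error = reward + gamma * phi[next_state] @ theta - phi[state] @ theta
    theta += alpha * td_error * phi[state]
    state = next_state

print("learned weights:", theta)
```

With a nonlinear approximator in place of the fixed linear features, this same update can diverge, which is the failure mode motivating the paper.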
dblp:conf/ijcai/PapavassiliouR99
fatcat:46sgqekcvnf2zj47pezc6er5ry