Convergence of synchronous reinforcement learning with linear function approximation

Artur Merke, Ralf Schoknecht
Proceedings of the Twenty-First International Conference on Machine Learning (ICML '04), 2004
Synchronous reinforcement learning (RL) algorithms with linear function approximation are representable as inhomogeneous matrix iterations of a special form (Schoknecht & Merke, 2003). In this paper we state conditions of convergence for general inhomogeneous matrix iterations and prove that they are both necessary and sufficient. This result extends the work presented in (Schoknecht & Merke, 2003), where only a sufficient condition of convergence was proved. Because the condition of convergence is necessary and sufficient, the new result is suitable for proving both convergence and divergence of RL algorithms with function approximation. We use the theorem to deduce a new concise proof of convergence for the synchronous residual gradient algorithm (Baird, 1995). Moreover, we derive a counterexample for which the uniform RL algorithm (Merke & Schoknecht, 2002) diverges. This yields a negative answer to the open question of whether the uniform RL algorithm converges for arbitrary multiple transitions.
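The abstract does not reproduce the iteration or the convergence conditions themselves. As a minimal sketch, assume the generic inhomogeneous form x_{k+1} = M x_k + b; the helper names below and the strict spectral-radius test are illustrative assumptions, not the paper's exact necessary-and-sufficient condition (which also covers eigenvalues on the unit circle). The sketch contrasts a contracting iteration, which converges to the fixed point (I - M)^{-1} b, with a diverging one.

```python
import numpy as np

def iterate(M, b, x0, steps=100):
    """Run the inhomogeneous matrix iteration x_{k+1} = M x_k + b."""
    x = x0
    for _ in range(steps):
        x = M @ x + b
    return x

def converges_for_all_starts(M, tol=1e-12):
    """Sufficient check used in this sketch: spectral radius of M strictly
    below 1. The paper's necessary-and-sufficient condition is finer and
    also treats the boundary case |lambda| = 1."""
    return np.max(np.abs(np.linalg.eigvals(M))) < 1 - tol

# Contracting example: eigenvalues 0.5 and 0.8, so the iteration converges
# from every starting point to the fixed point x* = (I - M)^{-1} b.
M = np.array([[0.5, 0.1],
              [0.0, 0.8]])
b = np.array([1.0, 1.0])
print(converges_for_all_starts(M))                           # True
x_star = np.linalg.solve(np.eye(2) - M, b)
print(np.allclose(iterate(M, b, np.zeros(2), 200), x_star))  # True

# Diverging example: an eigenvalue outside the unit circle, analogous in
# spirit to the divergence counterexamples discussed in the paper.
M_bad = np.array([[1.2, 0.0],
                  [0.0, 0.5]])
print(converges_for_all_starts(M_bad))                       # False
```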
doi:10.1145/1015330.1015390 dblp:conf/icml/MerkeS04