Kernelizing LSPE(λ)

T. Jung, D. Polani
2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
We propose the use of kernel-based methods as the underlying function approximator in the least-squares based policy evaluation framework of LSPE(λ) and LSTD(λ). In particular, we present the 'kernelization' of model-free LSPE(λ). The 'kernelization' is made computationally feasible by the subset of regressors approximation, which approximates the kernel using a vastly reduced number of basis functions. The core of our proposed solution is an efficient recursive implementation with automatic supervised selection of the relevant basis functions. The LSPE method is well-suited for optimistic policy iteration and can thus be used in the context of online reinforcement learning. We use the high-dimensional Octopus benchmark to demonstrate this.
doi:10.1109/adprl.2007.368208
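
The abstract names the two main ingredients: a subset of regressors approximation of the kernel (a small dictionary of basis centers) and a least-squares LSPE(λ) iteration built on those features. The Python sketch below illustrates these ingredients under stated assumptions only: it uses a batch (non-recursive) LSPE(λ) update, a fixed hand-picked dictionary instead of the paper's automatic supervised basis selection, and a toy random-walk chain with RBF features. The function names, kernel width, and MDP are illustrative assumptions, not the paper's code.

```python
# Minimal sketch: kernelized LSPE(lambda) policy evaluation with a
# subset-of-regressors style feature map (fixed dictionary of centers).
# Batch version for clarity; the paper describes a recursive, online
# implementation with automatic basis selection, which is not reproduced.
import numpy as np

def rbf_features(s, centers, width=3.0):
    """phi(s) = [k(s, c_1), ..., k(s, c_m)] for a reduced set of centers.
    Width is chosen to roughly match the spacing of the toy dictionary."""
    return np.exp(-((s - centers) ** 2) / (2.0 * width ** 2))

def lspe_lambda(transitions, centers, gamma=0.95, lam=0.6,
                step=1.0, ridge=1e-3, sweeps=20):
    """Accumulate B, A, b over (s, r, s') transitions with an eligibility
    trace, then iterate w <- w + step * B^{-1} (A w + b)."""
    m = len(centers)
    B = ridge * np.eye(m)    # sum of phi(s) phi(s)^T (+ ridge for stability)
    A = np.zeros((m, m))     # sum of z (gamma*phi(s') - phi(s))^T
    b = np.zeros(m)          # sum of z * r
    z = np.zeros(m)          # eligibility trace
    for s, r, s_next in transitions:
        phi = rbf_features(s, centers)
        phi_next = rbf_features(s_next, centers)
        z = gamma * lam * z + phi
        B += np.outer(phi, phi)
        A += np.outer(z, gamma * phi_next - phi)
        b += r * z
    w = np.zeros(m)
    for _ in range(sweeps):
        w = w + step * np.linalg.solve(B, A @ w + b)
    return w

if __name__ == "__main__":
    # Toy continuing random walk on {0,...,20}; reward 1 on reaching state 20.
    rng = np.random.default_rng(0)
    traj, s = [], 10
    for _ in range(3000):
        s_next = int(np.clip(s + rng.choice([-1, 1]), 0, 20))
        r = 1.0 if s_next == 20 else 0.0
        traj.append((float(s), r, float(s_next)))
        s = s_next
    centers = np.linspace(0, 20, 7)   # small, fixed dictionary of basis centers
    w = lspe_lambda(traj, centers)
    print([round(float(rbf_features(x, centers) @ w), 3) for x in (0.0, 10.0, 20.0)])
```

With a unit stepsize this is the standard LSPE(λ) fixed-point iteration, whose solution coincides with the LSTD(λ) estimate; the subset of regressors idea enters only through the reduced feature map built from a handful of kernel centers.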