The file type is application/pdf.
TIDBD: Adapting Temporal-difference Step-sizes Through Stochastic Meta-descent
[article]
2018
arXiv
pre-print
In this paper, we introduce a method for adapting the step-sizes of temporal-difference (TD) learning. The performance of TD methods often depends on well-chosen step-sizes, yet few algorithms have been developed for setting the step-size automatically for TD learning. An important limitation of current methods is that they adapt a single scalar step-size shared by all the weights of the learning system. A vector step-size allows finer-grained optimization by specifying a separate step-size for each feature.
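To make the scalar-versus-vector distinction concrete, here is a minimal sketch of one linear TD(0) update in which the step-size is a list with one entry per feature. This is not the TIDBD algorithm itself: it does not adapt the step-sizes, it only shows where a per-feature step-size enters the update. The function name `td0_update`, the discount `gamma`, and all numeric values are assumptions chosen for illustration.

```python
def td0_update(w, x, x_next, reward, alpha, gamma=0.9):
    """One linear TD(0) update with a per-feature step-size list `alpha`.

    Hypothetical helper for illustration only; the step-sizes are fixed
    here, whereas the paper's method adapts them.
    """
    # Linear value estimates for the current and next feature vectors.
    v = sum(wi * xi for wi, xi in zip(w, x))
    v_next = sum(wi * xi for wi, xi in zip(w, x_next))
    # TD error.
    delta = reward + gamma * v_next - v
    # Each weight moves by its own step-size times the TD error and feature.
    return [wi + ai * delta * xi for wi, ai, xi in zip(w, alpha, x)]


w = [0.0, 0.0, 0.0]
x = [1.0, 0.0, 1.0]
x_next = [0.0, 1.0, 0.0]

# Scalar step-size: the same value repeated for every feature.
w_scalar = td0_update(w, x, x_next, reward=1.0, alpha=[0.1, 0.1, 0.1])

# Vector step-size: a different (arbitrary) value for each feature.
w_vector = td0_update(w, x, x_next, reward=1.0, alpha=[0.5, 0.1, 0.01])
```

With the shared step-size, both active features move by the same amount; with the vector step-size, the first and third weights move by different amounts even though they see the same TD error.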
arXiv:1804.03334v1
fatcat:vspu4e3mg5dw3okjbfdj2mybie