Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
2007
Neural Computation
Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is influenced by an environmental signal, termed a reward, that directs the changes in appropriate directions. We apply a recently introduced policy learning algorithm from machine learning to networks of spiking neurons and derive a spike-time-dependent plasticity rule that ensures convergence to a local optimum of the
doi:10.1162/neco.2007.19.8.2245
pmid:17571943
fatcat:sodvrgjedzhfnh57hg5uspem7q