Evaluation of the hypothesis that phasic dopamine constitutes a cached-value signal

Melissa J. Sharpe, Geoffrey Schoenbaum
2018 Neurobiology of Learning and Memory  
The phasic dopamine error signal is currently argued to be synonymous with the prediction error in Sutton and Barto's (1987, 1998) model-free reinforcement learning algorithm (Schultz et al., 1997). This theory argues that phasic dopamine reflects a cached-value signal that endows reward-predictive cues with the scalar value inherent in reward. Such an interpretation does not envision a role for dopamine in more complex cognitive representations between events which underlie many forms of
associative learning, restricting the role dopamine can play in learning. The cached-value hypothesis of dopamine makes three concrete predictions about when a phasic dopamine response should be seen and what types of learning this signal should be able to promote. We discuss these predictions in light of recent evidence that we believe provides particularly strong tests of their validity. In doing so, we find that while the phasic dopamine signal conforms to a cached-value account in some circumstances, other evidence demonstrates that this signal is not restricted to a model-free cached-value reinforcement learning signal. In light of this evidence, we argue that the phasic dopamine signal functions more generally to signal violations of expectancies to drive real-world associations between events.
doi:10.1016/j.nlm.2017.12.002 pmid:29269085 pmcid:PMC6136434 fatcat:btizu5lptzblpck3yucifwykmm