Asymmetric and adaptive reward coding via normalized reinforcement learning
[article]
2021
bioRxiv
pre-print
At the neural level, diversity in asymmetries provides a computational mechanism for recently proposed theories of distributional RL, allowing the brain to learn the full probability distribution of future ...
This behavioral and computational flexibility argues for an incorporation of biologically valid value functions in computational models of learning and decision-making. ...
Together, these findings reconcile empirical and theoretical aspects of reinforcement learning, support the robustness of normalization-based value coding, and argue for the incorporation of biologically ...
doi:10.1101/2021.11.24.469880
fatcat:wf4nbalpmzeuriwvl2qjax4jwy
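The asymmetry mechanism this preprint points to can be illustrated with a toy distributional-RL update in which each value unit applies different learning rates to positive and negative prediction errors, so the population spans the reward distribution rather than tracking only its mean. A minimal sketch; the asymmetry parameters, learning rate, and reward distribution below are illustrative assumptions, not the paper's fitted values:

```python
import numpy as np

rng = np.random.default_rng(0)

# A population of value units, each with its own asymmetry tau in (0, 1).
# Units with larger tau weight positive prediction errors more strongly,
# so the population converges to different expectiles of the reward
# distribution instead of a single mean value.
taus = np.linspace(0.1, 0.9, 9)       # illustrative asymmetries
values = np.zeros_like(taus)
lr = 0.05

for _ in range(20000):
    r = rng.choice([0.1, 1.0, 5.0])   # illustrative multimodal reward distribution
    delta = r - values                # per-unit prediction errors
    scale = np.where(delta > 0, taus, 1.0 - taus)
    values += lr * scale * delta      # asymmetric update

print(np.round(values, 2))            # spans the distribution, not just its mean
```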
Dopamine signals for reward value and risk: basic and recent data
2010
Behavioral and Brain Functions
The neurons code reward value as it differs from prediction, thus fulfilling the basic requirement for a bidirectional prediction error teaching signal postulated by learning theory. ...
Expected reward value is a key decision variable for economic choices. The reward response codes reward value, probability and their summed product, expected value. ...
Acknowledgements This review was written on the occasion of the Symposium on Attention Deficit Hyperactivity Disorder (ADHD) in Oslo, Norway, February 2010. ...
doi:10.1186/1744-9081-6-24
pmid:20416052
pmcid:PMC2876988
fatcat:ek7pxmzwwvc4ncc5345bxkirfm
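The two quantities highlighted in this abstract, expected value (magnitude summed against probability) and a bidirectional prediction error, reduce to a few lines; the outcome magnitudes and probabilities below are made-up illustrations:

```python
# Expected value of a probabilistic reward: sum of magnitude x probability.
outcomes = [(0.0, 0.5), (4.0, 0.5)]                 # (magnitude, probability), illustrative
expected_value = sum(m * p for m, p in outcomes)    # = 2.0

# Bidirectional prediction error: positive when the outcome exceeds the
# prediction, negative when it falls short.
def prediction_error(received, predicted):
    return received - predicted

print(prediction_error(4.0, expected_value))   #  2.0 (better than expected)
print(prediction_error(0.0, expected_value))   # -2.0 (worse than expected)
```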
Dopamine: A Research Framework for Deep Reinforcement Learning
[article]
2018
arXiv
pre-print
Deep reinforcement learning (deep RL) research has grown significantly in recent years. A number of software offerings now exist that provide stable, comprehensive implementations for benchmarking. ...
At the same time, recent deep RL research has become more diverse in its goals. In this paper we introduce Dopamine, a new research framework for deep RL that aims to support some of that diversity. ...
offering of Dopamine focuses on value-based reinforcement learning applied to the Arcade Learning Environment. ...
arXiv:1812.06110v1
fatcat:ropxmfwnvzbnxbfunomxubselu
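In practice, running one of the framework's value-based agents means pointing a runner at a gin configuration. The sketch below follows the pattern shown in the project's documentation, but the helper names (load_gin_configs, create_runner), the example config path, and the output directory are assumptions that may differ across Dopamine versions:

```python
# Sketch of launching a value-based agent with the Dopamine framework.
# Helper names and the config path are assumed from the project docs and
# may vary by version.
from dopamine.discrete_domains import run_experiment

BASE_DIR = '/tmp/dopamine_run'                       # where checkpoints and logs go
GIN_FILES = ['dopamine/agents/dqn/configs/dqn.gin']  # assumed example config path

run_experiment.load_gin_configs(GIN_FILES, [])       # second arg: extra gin bindings
runner = run_experiment.create_runner(BASE_DIR)
runner.run_experiment()
```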
The biological and behavioral computations that influence dopamine responses
2018
Current Opinion in Neurobiology
Phasic dopamine responses demonstrate remarkable simplicity; they code for the differences between received and predicted reward values. ...
The application of optogenetics has provided evidence that dopamine reward prediction error responses cause value learning. ...
Such 'model-based' learning can occur, for instance, during a reversal learning task (for a review of model-free vs model-based reinforcement learning, see [47]). ...
doi:10.1016/j.conb.2018.02.005
pmid:29505948
pmcid:PMC6095465
fatcat:idmpav75qrgxrdw4ffjoihto5y
Dopamine neurons learn to encode the long-term value of multiple future rewards
2011
Proceedings of the National Academy of Sciences of the United States of America
The dopamine responses were quantitatively predicted by theoretical descriptions of the value function with time discounting in reinforcement learning. ...
If they play a critical role in achieving specific distant goals, long-term future rewards should also be encoded as suggested in reinforcement learning theories. ...
This study was supported by Grant-in-Aid 17022032 for Scientific Research on Priority Areas (to M.K.) and by the Development of Biomarker Candidates for Social Behavior carried out under the Strategic ...
doi:10.1073/pnas.1014457108
pmid:21896766
pmcid:PMC3174584
fatcat:5yzye7fxnjfuvc5ppvjaegm74y
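The "value function with time discounting" referenced in this abstract is the exponentially discounted sum of the rewards expected from the current point onward; a minimal illustration, with an arbitrary reward schedule and discount factor:

```python
# Discounted long-term value of multiple future rewards:
# V_t = sum over k of gamma**k * r_{t+k}.
def discounted_value(future_rewards, gamma=0.9):
    return sum(gamma**k * r for k, r in enumerate(future_rewards))

# Two future rewards of equal size: the nearer one contributes more to value.
print(discounted_value([1.0, 0.0, 0.0, 1.0]))  # 1.0 + 0.9**3 = 1.729
```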
Rare Rewards Amplify Dopamine Learning Responses
[article]
2019
bioRxiv
pre-print
For instance, reaction times and learning rates consistently reflect higher moments of probability distributions. Here, we demonstrate that dopamine RPE responses code probability distributions. ...
These results demonstrate that dopamine responses reflect probability distributions and suggest a neural mechanism for the amplified learning and enhanced arousal associated with rare events. ...
We used a reinforcement learning (RL) model to quantify the learned values (Supplementary Methods). ...
doi:10.1101/851709
fatcat:vvjo75fg2nbcxdhnd6l3gpoqya
Neuronal Reward and Decision Signals: From Theories to Data
2015
Physiological Reviews
Utility is the formal mathematical characterization of subjective value and a prime decision variable in economic choice theory. It is coded as utility prediction error by phasic dopamine responses. ...
Appropriate for formal decision mechanisms, rewards are coded as object value, action value, difference value, and chosen value by specific neurons. ...
model-free and model-based reinforcement learning, respectively. ...
doi:10.1152/physrev.00023.2014
pmid:26109341
pmcid:PMC4491543
fatcat:io5urzxu7rcttkaciaapilmkwe
How Reward and Aversion Shape Motivation and Decision Making: A Computational Account
2019
The Neuroscientist
In this review, we integrate historic findings on the behavioral and neural mechanisms of value-based decision making with recent, groundbreaking work in this area. ...
On the basis of this integrated view, we discuss a neuroeconomic framework of value-based decision making, use this to explain the motivation to pursue rewards and how motivation relates to the costs and ...
the lack of evidence for the OFC encoding value in a way that supports prediction-error-based learning. ...
doi:10.1177/1073858419834517
pmid:30866712
fatcat:d5u5trr2tncnxinzilfg3mcf2m
DeepGraphMol, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach
[article]
2020
bioRxiv
pre-print
Combinations of these terms, including drug likeness and synthetic accessibility, are then optimized using reinforcement learning based on a graph convolution policy approach. ...
We extend our method successfully to use a multi-objective reward function, in this case for generating novel molecules that bind with dopamine transporters but not with those for norepinephrine. ...
Reinforcement Learning setup Our model environment builds a molecule step by step with the addition of a new bond in each step. ...
doi:10.1101/2020.05.25.114165
fatcat:zfhmrsx6mzaibb4d626gkous5q
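The multi-objective reward described here combines several property scores into a single scalar that the graph-convolution policy is trained against. The sketch below shows the general weighted-sum pattern; the individual scorer functions and weights are placeholders, not the authors' actual predictors:

```python
# Scalar multi-objective reward for a generated molecule, built as a
# weighted combination of property scores. Each scorer is a placeholder
# standing in for a real predictor (e.g. drug-likeness, synthetic
# accessibility, affinity models for the dopamine vs. norepinephrine
# transporters).
def combined_reward(mol, scorers, weights):
    return sum(w * score(mol) for score, w in zip(scorers, weights))

def drug_likeness(mol):     return 0.7   # placeholder
def synthetic_access(mol):  return 0.6   # placeholder (higher = easier to make)
def dat_affinity(mol):      return 0.8   # placeholder, should be high
def net_affinity(mol):      return 0.2   # placeholder, should be low

reward = combined_reward(
    mol=None,   # placeholder for a molecular graph
    scorers=[drug_likeness, synthetic_access, dat_affinity, net_affinity],
    weights=[1.0, 0.5, 2.0, -2.0],       # negative weight penalizes NET binding
)
print(reward)   # 1.0*0.7 + 0.5*0.6 + 2.0*0.8 - 2.0*0.2 = 2.2
```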
Beyond simple reinforcement learning: the computational neurobiology of reward-learning and valuation
2012
European Journal of Neuroscience
Neural computational accounts of reward-learning have been dominated by the hypothesis that dopamine neurons behave like a reward-prediction error and thus facilitate reinforcement learning in striatal ...
In this special issue of EJN we feature a combination of theoretical and experimental papers highlighting some of the explanatory challenges faced by simple reinforcement-learning models and describing ...
a putative role for this brain region in model-based learning. ...
doi:10.1111/j.1460-9568.2012.08074.x
pmid:22487029
fatcat:cqyjcvisjjhpvhiyeu5uzt2gia
Reinforcement learning, conditioning, and the brain: Successes and challenges
2009
Cognitive, Affective, & Behavioral Neuroscience
Sometimes, such reinforcement is not deterministic even given the triplet (s, a, s′); in those cases, R(s, a, s′) is the expected value of the distribution of reinforcements when the agent is in state s, ...
Adaptive coding by dopamine neurons. Dopamine neurons do not seem to code the value of prediction errors in absolute terms. ...
doi:10.3758/cabn.9.4.343
pmid:19897789
fatcat:ovlgmuwlljeenhkjvix7lswqzy
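The adaptive coding mentioned in the last snippet is often formalized as divisive normalization of the prediction error by the spread of recently experienced rewards; the sketch below shows that common formalization, which is an assumption here rather than the review's specific model:

```python
# Adaptive (variance-normalized) coding of prediction errors: the same
# absolute error yields a smaller scaled response when recent rewards
# have been more variable.
def scaled_prediction_error(received, predicted, recent_reward_sd):
    delta = received - predicted
    return delta / max(recent_reward_sd, 1e-8)

print(scaled_prediction_error(3.0, 2.0, recent_reward_sd=0.5))   # 2.0
print(scaled_prediction_error(3.0, 2.0, recent_reward_sd=5.0))   # 0.2
```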
Stimulation of the vagus nerve reduces learning in a go/no-go reinforcement learning task: Supporting Information
[article]
2019
biorxiv/medrxiv
pre-print
These results highlight a novel role of vagal afferent input in modulating reinforcement learning by tuning the learning rate according to homeostatic needs. ...
Computational reinforcement learning models identified the cause of this as a reduction in the learning rate through tVNS (∆α = -0.092, p_boot = .002), particularly after punishment (∆α_Pun = -0.081, p_boot ...
Effects of the conditions were modeled by predicting if a given choice (Bernoulli distribution) was correct based on the regressors go (dummy coded), win (dummy coded), and the interaction term go ✕ win ...
doi:10.1101/535260
fatcat:ldc3mgyvirfuhgkoh2vzcsmtza
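The model comparison quoted above rests on a value update with separate learning rates after rewarded and punished outcomes, so a manipulation such as tVNS can lower learning specifically after punishment. A minimal form of that update; the parameter values are placeholders, not the fitted estimates:

```python
# Q-value update with outcome-dependent learning rates.
def update_q(q, outcome, alpha_reward=0.30, alpha_punish=0.20):
    alpha = alpha_reward if outcome > 0 else alpha_punish
    return q + alpha * (outcome - q)

q = 0.0
for outcome in [1, -1, 1, -1]:   # alternating win/punishment, illustrative
    q = update_q(q, outcome)
print(round(q, 3))
```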
Reinforcement learning: Computational theory and biological mechanisms
2007
HFSP Journal
Reinforcement learning is a computational framework for an active agent to learn behaviors on the basis of a scalar reward signal. ...
The theory of reinforcement learning, which was developed in an artificial intelligence community with intuitions from animal learning theory, is now giving a coherent account on the function of the basal ...
Based on the state representation in the cortex, the striatum learns state and action value functions. ...
doi:10.2976/1.2732246
pmid:19404458
pmcid:PMC2645553
fatcat:frdfeyvw4bdahdc6qphd4yslru
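The state and action value functions attributed to the striatum here correspond to V(s) and Q(s, a), both learnable from the same experience via temporal-difference errors. A compact tabular sketch on an invented two-state task; the dynamics and parameters are illustrative assumptions:

```python
import random

# Tabular learning of a state value function V(s) and an action value
# function Q(s, a) on a toy two-state task.
random.seed(0)
states, actions = [0, 1], [0, 1]
V = {s: 0.0 for s in states}
Q = {(s, a): 0.0 for s in states for a in actions}
alpha, gamma = 0.1, 0.9

def step(s, a):
    # Invented dynamics: action 1 leads to state 1; action 1 in state 1 is rewarded.
    s_next = 1 if a == 1 else 0
    r = 1.0 if (s == 1 and a == 1) else 0.0
    return r, s_next

s = 0
for _ in range(5000):
    a = random.choice(actions)   # random exploration
    r, s_next = step(s, a)
    # TD(0) update of the state value function.
    V[s] += alpha * (r + gamma * V[s_next] - V[s])
    # Q-learning update of the action value function.
    Q[(s, a)] += alpha * (r + gamma * max(Q[(s_next, b)] for b in actions) - Q[(s, a)])
    s = s_next

print(V, Q)
```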
Dopamine, Inference, and Uncertainty
[article]
2017
bioRxiv
pre-print
A coherent explanation for these deviations can be obtained by analyzing the dopamine response in terms of Bayesian reinforcement learning. ...
This account can explain dopamine responses to inferred value in sensory preconditioning, the effects of cue pre-exposure (latent inhibition) and adaptive coding of prediction errors when rewards vary ...
Acknowledgments This research was supported by the NSF Collaborative Research in Computational Neuroscience (CRCNS) Program Grant IIS-1207833. ...
doi:10.1101/149849
fatcat:3k5riglaxnea5ad433bwgelyp4
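The Bayesian reinforcement learning analysis referred to here treats the value estimate as a posterior whose effective learning rate is an uncertainty-dependent Kalman gain. A scalar sketch of that idea; the noise variances are illustrative and this is a simplification, not the paper's full model:

```python
# Scalar Kalman-filter value learning: the learning rate (Kalman gain)
# shrinks as the posterior becomes more certain, and uncertainty grows
# again between observations via diffusion noise.
value, posterior_var = 0.0, 1.0          # prior mean and variance (illustrative)
obs_noise_var, diffusion_var = 0.5, 0.01

def kalman_update(value, posterior_var, reward):
    posterior_var += diffusion_var                    # uncertainty grows over time
    gain = posterior_var / (posterior_var + obs_noise_var)
    value += gain * (reward - value)                  # prediction error scaled by gain
    posterior_var *= (1.0 - gain)                     # uncertainty shrinks after observing
    return value, posterior_var

for reward in [1.0, 1.0, 0.0, 1.0]:
    value, posterior_var = kalman_update(value, posterior_var, reward)
    print(round(value, 3), round(posterior_var, 3))
```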