
Asymmetric and adaptive reward coding via normalized reinforcement learning [article]

Kenway Louie
2021 bioRxiv   pre-print
At the neural level, diversity in asymmetries provides a computational mechanism for recently proposed theories of distributional RL, allowing the brain to learn the full probability distribution of future  ...  This behavioral and computational flexibility argues for an incorporation of biologically valid value functions in computational models of learning and decision-making.  ...  Together, these findings reconcile empirical and theoretical aspects of reinforcement learning, support the robustness of normalization-based value coding, and argue for the incorporation of biologically  ... 
doi:10.1101/2021.11.24.469880 fatcat:wf4nbalpmzeuriwvl2qjax4jwy
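
The asymmetric learning-rate idea referenced in this abstract can be illustrated with a small sketch (not the paper's model; the update rule, learning rates, and reward probability below are illustrative assumptions): each value channel weights positive and negative prediction errors differently, so the population of channels spans the reward distribution, as in expectile-style distributional RL.

    import random

    def update_channels(values, reward, alphas_pos, alphas_neg):
        """Asymmetric value update: each channel weights positive and negative
        prediction errors differently, so the population spans the reward
        distribution (an expectile-like code)."""
        new_values = []
        for v, a_pos, a_neg in zip(values, alphas_pos, alphas_neg):
            delta = reward - v                      # prediction error
            alpha = a_pos if delta > 0 else a_neg   # asymmetric learning rate
            new_values.append(v + alpha * delta)
        return new_values

    # Three channels: channel 0 pessimistic, channel 2 optimistic (illustrative).
    values = [0.0, 0.0, 0.0]
    alphas_pos = [0.02, 0.10, 0.18]
    alphas_neg = [0.18, 0.10, 0.02]
    for _ in range(5000):
        reward = 1.0 if random.random() < 0.3 else 0.0
        values = update_channels(values, reward, alphas_pos, alphas_neg)
    print(values)   # converged estimates straddle the mean reward of 0.3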

Dopamine signals for reward value and risk: basic and recent data

Wolfram Schultz
2010 Behavioral and Brain Functions  
The neurons code reward value as it differs from prediction, thus fulfilling the basic requirement for a bidirectional prediction error teaching signal postulated by learning theory.  ...  Expected reward value is a key decision variable for economic choices. The reward response codes reward value, probability and their summed product, expected value.  ...  Acknowledgements This review was written on the occasion of the Symposium on Attention Deficit Hyperactivity Disorder (ADHD) in Oslo, Norway, February 2010.  ... 
doi:10.1186/1744-9081-6-24 pmid:20416052 pmcid:PMC2876988 fatcat:ek7pxmzwwvc4ncc5345bxkirfm
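
As a rough numerical illustration of the quantities named in this abstract (the numbers are invented, not taken from the paper): expected value is the probability-weighted sum of reward magnitudes, and a bidirectional prediction error is simply the signed difference between the delivered reward and that expectation.

    # Expected value: probability-weighted sum of reward magnitudes.
    outcomes = [(0.25, 0.0), (0.75, 0.4)]               # (probability, magnitude), illustrative
    expected_value = sum(p * r for p, r in outcomes)     # 0.3

    # Bidirectional prediction error: positive if the reward exceeds the
    # prediction, negative if it falls short of it.
    for delivered in (0.0, 0.4):
        rpe = delivered - expected_value
        print(f"delivered={delivered:.1f}  prediction error={rpe:+.2f}")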

Dopamine: A Research Framework for Deep Reinforcement Learning [article]

Pablo Samuel Castro, Subhodeep Moitra, Carles Gelada, Saurabh Kumar, Marc G. Bellemare
2018 arXiv   pre-print
Deep reinforcement learning (deep RL) research has grown significantly in recent years. A number of software offerings now exist that provide stable, comprehensive implementations for benchmarking.  ...  At the same time, recent deep RL research has become more diverse in its goals. In this paper we introduce Dopamine, a new research framework for deep RL that aims to support some of that diversity.  ...  offering of Dopamine focuses on value-based reinforcement learning applied to the Arcade Learning Environment.  ... 
arXiv:1812.06110v1 fatcat:ropxmfwnvzbnxbfunomxubselu

The biological and behavioral computations that influence dopamine responses

WR Stauffer
2018 Current Opinion in Neurobiology  
Phasic dopamine responses demonstrate remarkable simplicity; they code for the differences between received and predicted reward values.  ...  The application of optogenetics has provided evidence that dopamine reward prediction error responses cause value learning.  ...  Such 'model-based' learning can occur, for instance, during a reversal learning task (for a review of model-free vs model-based reinforcement learning, see [47]).  ... 
doi:10.1016/j.conb.2018.02.005 pmid:29505948 pmcid:PMC6095465 fatcat:idmpav75qrgxrdw4ffjoihto5y
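
The model-free versus model-based distinction mentioned in this snippet can be sketched as follows (a toy example with invented transition probabilities, not the tasks discussed in the cited work): a model-free learner caches values and changes them only through prediction errors, whereas a model-based learner recomputes values from an explicit world model and so adjusts immediately after a reversal.

    # Model-free: a cached value that changes only through prediction errors.
    def model_free_update(q, reward, alpha=0.1):
        return q + alpha * (reward - q)

    # Model-based: value recomputed on demand from an explicit world model.
    def model_based_value(transition_probs, state_rewards):
        """Expected reward given p(next_state | current state, action)."""
        return sum(p * state_rewards[s] for s, p in transition_probs.items())

    transition_probs = {"A": 0.8, "B": 0.2}
    print(model_based_value(transition_probs, {"A": 1.0, "B": 0.0}))   # 0.8
    # After a reversal of outcomes the model-based value changes at once,
    # while a cached model-free value must be relearned trial by trial.
    print(model_based_value(transition_probs, {"A": 0.0, "B": 1.0}))   # 0.2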

Dopamine neurons learn to encode the long-term value of multiple future rewards

K. Enomoto, N. Matsumoto, S. Nakai, T. Satoh, T. K. Sato, Y. Ueda, H. Inokawa, M. Haruno, M. Kimura
2011 Proceedings of the National Academy of Sciences of the United States of America  
The dopamine responses were quantitatively predicted by theoretical descriptions of the value function with time discounting in reinforcement learning.  ...  If they play a critical role in achieving specific distant goals, long-term future rewards should also be encoded as suggested in reinforcement learning theories.  ...  This study was supported by Grant-in-Aid 17022032 for Scientific Research on Priority Areas (to M.K.) and by the Development of Biomarker Candidates for Social Behavior carried out under the Strategic  ... 
doi:10.1073/pnas.1014457108 pmid:21896766 pmcid:PMC3174584 fatcat:5yzye7fxnjfuvc5ppvjaegm74y
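
The time-discounted value function referred to here is the discounted sum of future rewards, V = sum_k gamma^k * r_k; the sketch below evaluates it for an arbitrary reward schedule and discount factor (both invented for illustration, not the task parameters of the study).

    def discounted_value(rewards, gamma=0.9):
        """Value of the current state: discounted sum of rewards from this step onward."""
        return sum((gamma ** k) * r for k, r in enumerate(rewards))

    # Two rewards arriving one and two steps in the future (illustrative schedule).
    print(discounted_value([0.0, 1.0, 1.0], gamma=0.9))   # 0.9 + 0.81 = 1.71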

Rare Rewards Amplify Dopamine Learning Responses [article]

Kathryn M. Rothenhoefer, Tao Hong, Aydin Alikaya, William R. Stauffer
2019 bioRxiv   pre-print
For instance, reaction times and learning rates consistently reflect higher moments of probability distributions. Here, we demonstrate that dopamine RPE responses code probability distributions.  ...  These results demonstrate that dopamine responses reflect probability distributions and suggest a neural mechanism for the amplified learning and enhanced arousal associated with rare events.  ...  We used a reinforcement learning (RL) model to quantify the learned values (Supplementary Methods).  ... 
doi:10.1101/851709 fatcat:vvjo75fg2nbcxdhnd6l3gpoqya
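
One way to see why rare rewards produce amplified learning signals in a standard RL model (a generic Rescorla-Wagner-style sketch, not the authors' fitted model): when reward is delivered only rarely, the learned value stays low, so the prediction error on the rare rewarded trials is large.

    import random

    def mean_rpe_on_rewarded_trials(p_reward, n_trials=2000, alpha=0.1):
        """Average prediction error on rewarded trials for a given reward probability."""
        v, rpes = 0.0, []
        for _ in range(n_trials):
            r = 1.0 if random.random() < p_reward else 0.0
            delta = r - v
            if r > 0:
                rpes.append(delta)        # learning signal on the rewarded trials
            v += alpha * delta
        return sum(rpes) / len(rpes)

    print(mean_rpe_on_rewarded_trials(0.1))   # rare reward: average RPE near 0.9
    print(mean_rpe_on_rewarded_trials(0.9))   # frequent reward: average RPE near 0.1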

Neuronal Reward and Decision Signals: From Theories to Data

Wolfram Schultz
2015 Physiological Reviews  
Utility is the formal mathematical characterization of subjective value and a prime decision variable in economic choice theory. It is coded as utility prediction error by phasic dopamine responses.  ...  Appropriate for formal decision mechanisms, rewards are coded as object value, action value, difference value, and chosen value by specific neurons.  ...  free and model-based reinforcement learning, respectively.  ... 
doi:10.1152/physrev.00023.2014 pmid:26109341 pmcid:PMC4491543 fatcat:io5urzxu7rcttkaciaapilmkwe
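
The value terms listed in this abstract can be kept apart with a small sketch (the utility function and all numbers are illustrative assumptions): object value attaches to an offered good, action value to the action that obtains it, chosen value to the option actually selected, and a utility prediction error compares subjective utilities rather than raw magnitudes.

    import math

    def utility(x):
        """Illustrative concave utility: diminishing marginal value of reward."""
        return math.sqrt(x)

    object_values = {"juice": 0.8, "water": 0.3}          # value of each offered good
    action_values = {"left": object_values["juice"],       # value of the action obtaining it
                     "right": object_values["water"]}

    chosen_action = max(action_values, key=action_values.get)
    chosen_value = action_values[chosen_action]

    delivered = 0.5                                        # actual reward magnitude
    utility_prediction_error = utility(delivered) - utility(chosen_value)
    print(chosen_action, chosen_value, round(utility_prediction_error, 3))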

How Reward and Aversion Shape Motivation and Decision Making: A Computational Account

Jeroen P. H. Verharen, Roger A. H. Adan, Louk J. M. J. Vanderschuren
2019 The Neuroscientist  
In this review, we integrate historic findings on the behavioral and neural mechanisms of value-based decision making with recent, groundbreaking work in this area.  ...  On the basis of this integrated view, we discuss a neuroeconomic framework of value-based decision making, use this to explain the motivation to pursue rewards and how motivation relates to the costs and  ...  the lack of evidence for the OFC encoding value in a way that supports predictionerror based learning.  ... 
doi:10.1177/1073858419834517 pmid:30866712 fatcat:d5u5trr2tncnxinzilfg3mcf2m

DeepGraphMol, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach [article]

Yash Khemchandani, Steve O'Hagan, Soumitra Samanta, Neil Swainston, Timothy J Roberts, Danushka Bollegala, Douglas B Kell
2020 bioRxiv   pre-print
Combinations of these terms, including drug likeness and synthetic accessibility, are then optimized using reinforcement learning based on a graph convolution policy approach.  ...  We extend our method successfully to use a multi-objective reward function, in this case for generating novel molecules that bind with dopamine transporters but not with those for norepinephrine.  ...  Reinforcement Learning setup Our model environment builds a molecule step by step with the addition of a new bond in each step.  ... 
doi:10.1101/2020.05.25.114165 fatcat:zfhmrsx6mzaibb4d626gkous5q
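
The multi-objective reward described in this snippet can be sketched as a weighted combination of per-property scores (the property names, scores, and weights below are placeholders, not the terms used in the paper); the RL agent that builds the molecule bond by bond is then trained to maximize this single scalar.

    def multi_objective_reward(mol_scores, weights):
        """Scalar reward = weighted sum of per-objective scores for a candidate molecule.

        mol_scores might contain, e.g., predicted affinity at the desired target,
        affinity at an off-target (penalized by a negative weight), drug-likeness,
        and synthetic accessibility -- all placeholder names here."""
        return sum(weights[name] * score for name, score in mol_scores.items())

    scores = {
        "target_affinity": 0.9,       # higher is better
        "off_target_affinity": 0.4,   # penalized via a negative weight
        "drug_likeness": 0.7,
        "synthetic_accessibility": 0.6,
    }
    weights = {"target_affinity": 1.0, "off_target_affinity": -1.0,
               "drug_likeness": 0.5, "synthetic_accessibility": 0.25}
    print(multi_objective_reward(scores, weights))   # 0.9 - 0.4 + 0.35 + 0.15 = 1.0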

Beyond simple reinforcement learning: the computational neurobiology of reward-learning and valuation

John P. O'Doherty
2012 European Journal of Neuroscience  
Neural computational accounts of reward-learning have been dominated by the hypothesis that dopamine neurons behave like a reward-prediction error and thus facilitate reinforcement learning in striatal  ...  In this special issue of EJN we feature a combination of theoretical and experimental papers highlighting some of the explanatory challenges faced by simple reinforcement-learning models and describing  ...  a putative role for this brain region in model-based learning.  ... 
doi:10.1111/j.1460-9568.2012.08074.x pmid:22487029 fatcat:cqyjcvisjjhpvhiyeu5uzt2gia

Reinforcement learning, conditioning, and the brain: Successes and challenges

Tiago V. Maia
2009 Cognitive, Affective, & Behavioral Neuroscience  
Sometimes, such reinforcement is not deterministic even given the triplet s, a, s′; in those cases, R(s, a, s′) is the expected value of the distribution of reinforcements when the agent is in state s,  ...  coding.  ...  Adaptive coding by dopamine neurons. Dopamine neurons do not seem to code the value of prediction errors in absolute terms.  ... 
doi:10.3758/cabn.9.4.343 pmid:19897789 fatcat:ovlgmuwlljeenhkjvix7lswqzy
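
The snippet's definition of R(s, a, s′) as the expected value of a reinforcement distribution can be made concrete with a short sketch (the distribution below is invented): when reinforcement is stochastic for a given (state, action, next-state) triplet, the reward function stores its expectation.

    # Distribution of reinforcements observed for one (s, a, s') triplet:
    # (probability, magnitude) pairs -- illustrative numbers.
    reinforcement_distribution = [(0.5, 0.0), (0.3, 1.0), (0.2, 2.0)]

    # R(s, a, s') is the expected value of that distribution.
    R_sas = sum(p * r for p, r in reinforcement_distribution)
    print(R_sas)   # 0.5*0 + 0.3*1 + 0.2*2 = 0.7

    # Adaptive coding (also mentioned in the snippet) would further rescale a
    # prediction error by the local spread of rewards, e.g. delta / (r_max - r_min).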

Stimulation of the vagus nerve reduces learning in a go/no-go reinforcement learning task: Supporting Information [article]

Anne Kuehnel, Vanessa Teckentrup, Monja P. Neuser, Quentin J. M. Huys, Caroline Burrasch, Martin Walter, Nils B Kroemer
2019 biorxiv/medrxiv   pre-print
These results highlight a novel role of vagal afferent input in modulating reinforcement learning by tuning the learning rate according to homeostatic needs.  ...  Computational reinforcement learning models identified the cause of this as a reduction in the learning rate through tVNS (∆α = -0.092, p_boot = .002), particularly after punishment (∆α_Pun = -0.081, p_boot  ...  Effects of the conditions were modeled by predicting if a given choice (Bernoulli distribution) was correct based on the regressors go (dummy coded), win (dummy coded), and the interaction term go ✕ win  ... 
doi:10.1101/535260 fatcat:ldc3mgyvirfuhgkoh2vzcsmtza
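
The snippet's key quantity, a learning rate that differs after punishment versus reward, can be illustrated with a minimal Rescorla-Wagner-style update using separate α parameters (a generic sketch with invented values, not the authors' fitted go/no-go model).

    def update_value(v, outcome, alpha_reward=0.30, alpha_punish=0.25):
        """Update a stimulus value with outcome-dependent learning rates.

        A manipulation that lowers alpha_punish (as reported for tVNS in the
        abstract) slows learning specifically from punishments."""
        delta = outcome - v
        alpha = alpha_reward if outcome > 0 else alpha_punish
        return v + alpha * delta

    v = 0.0
    for outcome in (1.0, -1.0, -1.0, 1.0):   # win, loss, loss, win -- illustrative sequence
        v = update_value(v, outcome)
        print(round(v, 3))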

Reinforcement learning: Computational theory and biological mechanisms

Kenji Doya
2007 HFSP Journal  
Reinforcement learning is a computational framework for an active agent to learn behaviors on the basis of a scalar reward signal.  ...  The theory of reinforcement learning, which was developed in an artificial intelligence community with intuitions from animal learning theory, is now giving a coherent account on the function of the basal  ...  Based on the state representation in the cortex, the striatum learns state and action value functions.  ... 
doi:10.2976/1.2732246 pmid:19404458 pmcid:PMC2645553 fatcat:frdfeyvw4bdahdc6qphd4yslru
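
The division of labor described in this snippet, state values and action values learned from a state representation, corresponds to a simple actor-critic/Q-learning arrangement; the sketch below (generic, with invented states and rewards, not the paper's model) maintains a state value function V(s) and an action value function Q(s, a) updated by the same TD error.

    from collections import defaultdict

    V = defaultdict(float)                 # state value function
    Q = defaultdict(float)                 # action value function, keyed by (state, action)
    alpha, gamma = 0.1, 0.9

    def td_step(state, action, reward, next_state):
        """One temporal-difference update of V(s) and Q(s, a)."""
        td_error = reward + gamma * V[next_state] - V[state]
        V[state] += alpha * td_error
        Q[(state, action)] += alpha * td_error
        return td_error

    # Illustrative transitions: (state, action, reward, next_state).
    for transition in [("s0", "go", 0.0, "s1"), ("s1", "go", 1.0, "s0")]:
        td_step(*transition)
    print(dict(V), dict(Q))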

Dopamine, Inference, and Uncertainty [article]

Samuel J. Gershman
2017 bioRxiv   pre-print
A coherent explanation for these deviations can be obtained by analyzing the dopamine response in terms of Bayesian reinforcement learning.  ...  This account can explain dopamine responses to inferred value in sensory preconditioning, the effects of cue pre-exposure (latent inhibition) and adaptive coding of prediction errors when rewards vary  ...  Acknowledgments This research was supported by the NSF Collaborative Research in Computational Neuroscience (CRCNS) Program Grant IIS-1207833.  ... 
doi:10.1101/149849 fatcat:3k5riglaxnea5ad433bwgelyp4
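
The Bayesian treatment mentioned in this abstract replaces a fixed learning rate with an uncertainty-dependent gain; a minimal Kalman-filter-style value update (a generic sketch with invented noise parameters, not the paper's full model) shows the idea: the Kalman gain, and hence the effective learning rate, is large when the value estimate is uncertain.

    def kalman_value_update(mean, variance, reward, obs_noise=1.0, process_noise=0.01):
        """One Kalman-filter update of a reward estimate.

        The gain plays the role of an adaptive learning rate: high posterior
        uncertainty (variance) -> large gain -> strong update from the
        prediction error."""
        variance += process_noise                    # diffusion between trials
        gain = variance / (variance + obs_noise)     # Kalman gain
        prediction_error = reward - mean
        mean += gain * prediction_error
        variance *= (1.0 - gain)
        return mean, variance

    mean, variance = 0.0, 1.0                        # uncertain prior
    for r in (1.0, 1.0, 1.0, 1.0):
        mean, variance = kalman_value_update(mean, variance, r)
        print(round(mean, 3), round(variance, 3))    # gain shrinks as uncertainty falls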