6,742 Hits in 2.9 sec

Temporal Abstraction in Reinforcement Learning with the Successor Representation [article]

Marlos C. Machado, Andre Barreto, Doina Precup
2021 arXiv pre-print
In reinforcement learning, this is often modeled through temporally extended courses of action called options.  ...  use of temporal abstractions.  ...  The authors would like to thank Tom Schaul and Adam White for their thorough feedback on an earlier draft; and Dale Schuurmans, Yuu Jinnai, Marc G.  ... 
arXiv:2110.05740v1 fatcat:zrguhnljlvbyhlvqlir4cpfrx4
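
For orientation across these results: the successor representation (SR) that most of the listed papers build on is conventionally defined as the expected discounted count of future visits to each state. This is the standard textbook form; notation varies across the individual papers:

    M^{\pi}(s, s') = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, \mathbb{1}(s_t = s') \,\middle|\, s_0 = s \right]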

The Successor Representation: Its Computational Logic and Neural Substrates

Samuel J. Gershman
2018 Journal of Neuroscience  
Following this logic leads to the idea of the successor representation, which encodes states of the environment in terms of their predictive relationships with other states.  ...  Reinforcement learning is the process by which an agent learns to predict long-term future reward.  ...  Notice that unlike the temporal difference error for value learning, the temporal difference error for SR learning is vector-valued, with one error for each successor state.  ... 
doi:10.1523/jneurosci.0151-18.2018 pmid:30006364 pmcid:PMC6096039 fatcat:bz4mz2euxjfj7e3omvkqylluue
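
The fragment above highlights that the SR's temporal-difference error is vector-valued, with one component per successor state. A minimal tabular sketch of that update (the state count, discount, and step size are illustrative, not taken from the paper):

    import numpy as np

    n_states, gamma, alpha = 5, 0.95, 0.1
    M = np.zeros((n_states, n_states))  # successor representation matrix

    def sr_td_update(s, s_next):
        # one-hot indicator of the successor state actually observed
        indicator = np.zeros(n_states)
        indicator[s_next] = 1.0
        # vector-valued TD error: one component per successor state
        delta = indicator + gamma * M[s_next] - M[s]
        M[s] += alpha * delta
        return delta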

Predicting the future with multi-scale successor representations [article]

Ida Momennejad, Marc W. Howard
2018 bioRxiv pre-print
The successor representation (SR) is a candidate principle for generalization in reinforcement learning, computational accounts of memory, and the structure of neural representations in the hippocampus  ...  However, SR with a single scale could discard information for predicting both the sequential order of and the distance between states, which are common problems in navigation for animals and artificial  ...  This work was funded by the John Templeton Foundation (IM), NIBIB R01EB022864, and ONR MURI N00014-16-1-2832 (MWH).  ... 
doi:10.1101/449470 fatcat:p66qw4imbjadlpw5ur3m6so6yu
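
A hedged sketch of the multi-scale idea from the abstract: maintain SR maps at several discount horizons, since a bank of scales retains the order and distance information that any single scale discards. The particular gamma values below are illustrative:

    import numpy as np

    n_states, alpha = 5, 0.1
    gammas = [0.5, 0.8, 0.95, 0.99]  # illustrative set of temporal scales
    M_bank = {g: np.zeros((n_states, n_states)) for g in gammas}

    def multiscale_sr_update(s, s_next):
        indicator = np.zeros(n_states)
        indicator[s_next] = 1.0
        for g, M in M_bank.items():
            # same TD rule as a single-scale SR, one map per horizon
            M[s] += alpha * (indicator + g * M[s_next] - M[s])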

The Successor Representation and Temporal Context

Samuel J. Gershman, Christopher D. Moore, Michael T. Todd, Kenneth A. Norman, Per B. Sederberg
2012 Neural Computation  
The successor representation was introduced into reinforcement learning by Dayan (1993) as a means of facilitating generalization between states with similar successors.  ...  Although reinforcement learning in general has been used extensively as a model of psychological and neural processes, the psychological validity of the successor representation has yet to be explored.  ...  Acknowledgements. Respective contributions: The link between TCM and the successor representation was first worked out by CDM and PBS, in consultation with SJG, MTT, and KAN; the paper was primarily written  ... 
doi:10.1162/neco_a_00282 pmid:22364500 fatcat:tbncufrahnh5phszrgdrdukdfq

Dynamics-aware Embeddings [article]

William Whitney, Rajat Agarwal, Kyunghyun Cho, Abhinav Gupta
2020 arXiv pre-print
In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL).  ...  These embeddings capture the structure of the environment's dynamics, enabling efficient policy learning.  ...  Similarly to this work, hierarchical reinforcement learning seeks to learn temporal abstractions.  ... 
arXiv:1908.09357v3 fatcat:2qy6l4hb4fasbejvevkkfnpywi

Deep Successor Reinforcement Learning [article]

Tejas D. Kulkarni, Ardavan Saeedi, Simanta Gautam, Samuel J. Gershman
2016 arXiv pre-print
Learning robust value functions given raw observations and rewards is now possible with model-free and model-based deep reinforcement learning algorithms.  ...  In this paper, we present DSR, which generalizes SR within an end-to-end deep reinforcement learning framework.  ...  Hierarchical reinforcement learning algorithms [1] such as the options framework [38, 39] provide a flexible framework to create temporal abstractions, which will enable exploration at different time-scales  ... 
arXiv:1606.02396v1 fatcat:st7mhic7azgebcdr75ahac5kpi
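
A hedged sketch of the factorization DSR builds on: the value of a state decomposes into a dot product between successor features psi(s) and learned reward weights w, and psi obeys its own Bellman equation over state features phi. Variable names here are illustrative, not the paper's:

    import numpy as np

    d, gamma = 8, 0.95
    w = np.zeros(d)  # reward weights, fit so that r(s) ≈ phi(s) @ w

    def value(psi_s):
        # V(s) = psi(s) . w  -- the successor-feature value factorization
        return psi_s @ w

    def psi_td_target(phi_s, psi_s_next):
        # successor features satisfy psi(s) = phi(s) + gamma * E[psi(s')]
        return phi_s + gamma * psi_s_next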

Successor Options: An Option Discovery Framework for Reinforcement Learning [article]

Rahul Ramesh, Manan Tomar, Balaraman Ravindran
2019 arXiv pre-print
The options framework in reinforcement learning models the notion of a skill or a temporally extended sequence of actions.  ...  In this work, we propose Successor Options, which leverages Successor Representations to build a model of the state space.  ...  Conclusion: Successor Options is an option discovery framework that leverages Successor Representations to build options.  ... 
arXiv:1905.05731v1 fatcat:h7zwlqb6v5bprc4bfnijhf5lzy
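
A minimal sketch of how an SR matrix can seed option discovery, loosely following the clustering idea the abstract describes. The cluster count and the choice of k-means are illustrative assumptions, not the paper's exact procedure:

    import numpy as np
    from sklearn.cluster import KMeans

    def propose_subgoals(M, n_options=4):
        # cluster states by their SR rows; the state nearest each cluster
        # centre becomes a candidate subgoal around which to build an option
        km = KMeans(n_clusters=n_options, n_init=10).fit(M)
        return [int(np.argmin(np.linalg.norm(M - c, axis=1)))
                for c in km.cluster_centers_]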

The growth and form of knowledge networks by kinesthetic curiosity [article]

Dale Zhou, David M. Lydon-Staley, Perry Zurn, Danielle S. Bassett
2020 arXiv pre-print
The kinesthetic model of curiosity offers a vibrant counterpart to the deliberative predictions of model-based reinforcement learning.  ...  The practice of curiosity can be viewed as an extended and open-ended search for valuable information with hidden identity and location in a complex space of interconnected information.  ...  MacArthur Foundation, the Alfred P. Sloan Foundation, the ISI Foundation, the Paul Allen Foundation, the Army Research Laboratory  ... 
arXiv:2006.02949v1 fatcat:z64rwtczqbfhtatajvvdn47dp4

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation [article]

Tejas D. Kulkarni, Karthik R. Narasimhan, Ardavan Saeedi, Joshua B. Tenenbaum
2016 arXiv pre-print
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms.  ...  We present hierarchical-DQN (h-DQN), a framework to integrate hierarchical value functions, operating at different temporal scales, with intrinsically motivated deep reinforcement learning.  ...  We are grateful to receive support from the Center for Brain, Machines and Minds (NSF STC award CCF-1231216) and the MIT OpenMind team.  ... 
arXiv:1604.06057v2 fatcat:p33suojusrcpfpg4ybc4hrfj6y
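
A schematic sketch of the two-timescale loop the abstract describes: a meta-controller picks goals at a coarse timescale and a controller pursues each goal under an intrinsic reward. Every helper here (select_goal, select_action, update, the goal test, the environment interface) is a hypothetical placeholder, not the paper's API:

    def goal_reached(s, goal):
        # hypothetical goal test: has the agent's state attained the goal?
        return s == goal

    def intrinsic_reward(s, goal):
        # hypothetical intrinsic reward: 1 on reaching the goal, else 0
        return 1.0 if s == goal else 0.0

    def h_dqn_episode(env, meta_controller, controller):
        s, done = env.reset(), False
        while not done:
            goal = meta_controller.select_goal(s)          # coarse timescale
            extrinsic_return = 0.0
            while not done and not goal_reached(s, goal):
                a = controller.select_action(s, goal)      # fine timescale
                s2, r, done = env.step(a)
                # controller learns from the intrinsic reward for this goal
                controller.update(s, goal, a, intrinsic_reward(s2, goal), s2)
                extrinsic_return += r
                s = s2
            # meta-controller learns from the accumulated extrinsic reward
            meta_controller.update(s, goal, extrinsic_return)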

Reinforcement Learning and its Connections with Neuroscience and Psychology [article]

Ajay Subramanian, Sharad Chitlangia, Veeky Baths
2021 arXiv pre-print
...  making in the brain.  ...  These algorithms have outperformed humans in several tasks by learning from scratch, using only scalar rewards obtained through interaction with their environment.  ...  Akam and Walton [122] recently proposed that successor representations are a key module in how model-based reinforcement learning systems are coupled with model-free reinforcement learning systems.  ... 
arXiv:2007.01099v5 fatcat:mjpkztlmqnfjba3dtcwqwmmlvu

A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation [article]

Scott Fujimoto, David Meger, Doina Precup
2021 arXiv pre-print
We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy.  ...  The successor representation can be trained through deep reinforcement learning methodology and decouples the reward optimization from the dynamics of the environment, making the resulting algorithm stable  ...  This research was enabled in part by support provided by Calcul Québec and Compute Canada.  ... 
arXiv:2106.06854v1 fatcat:4z7eaqfh6rfkjao3z2qip34n4i
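
The link the abstract alludes to rests on a standard identity: a policy's discounted state occupancy is a linear function of its successor representation (this is the textbook relation; the paper's deep, off-policy estimator refines it considerably):

    d^{\pi}(s') = (1 - \gamma) \sum_{s} d_0(s)\, M^{\pi}(s, s')

Here d_0 is the initial-state distribution; the density ratio used in marginalized importance sampling then follows by dividing this occupancy by the sampling distribution.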

From pixels to policies: A bootstrapping agent

Jeremy Stober, Benjamin Kuipers
2008 7th IEEE International Conference on Development and Learning
These derived variables are abstracted to discrete qualitative variables, which serve as features for temporal difference learning.  ...  We describe an approach by which an autonomous learning agent can bootstrap its way from pixel-level interaction with the world, to individuating and tracking objects in the environment, to learning an  ...  Solving reinforcement learning problems through temporal difference methods is also explored extensively in the literature [9], [10].  ... 
doi:10.1109/devlrn.2008.4640813 fatcat:nukv3fw5cbdhjnmqcjpdv4aope
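
For reference, the tabular TD(0) update applied once pixels have been abstracted into discrete features is the standard one (step size and discount below are illustrative):

    import numpy as np

    n_states, gamma, alpha = 10, 0.99, 0.05
    V = np.zeros(n_states)  # one value estimate per discrete qualitative state

    def td0_update(s, r, s_next):
        # standard TD(0): move V(s) toward the bootstrapped target r + gamma * V(s')
        V[s] += alpha * (r + gamma * V[s_next] - V[s])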

Reinforcement-guided learning in frontal neocortex: emerging computational concepts

Abhishek Banerjee, Rajeev V Rikhye, Adam Marblestone
2021 Current Opinion in Behavioral Sciences  
This temporal difference framework, however, even when augmented with deep credit assignment, does not fully capture higher-order processes such as the influence of goal representations, planning based  ...  The classical concepts of reinforcement learning in the mammalian brain focus on dopamine release in the basal ganglia as the neural substrate of reward prediction errors, which drive plasticity in striatal  ...  Haydock in reading the manuscript. The authors thank Drs. Fritjof Helmchen and Christopher Lewis for their comments on an earlier version of the manuscript.  ... 
doi:10.1016/j.cobeha.2021.02.019 fatcat:e23xwqtibzgidcuozvfdm47dzy

Abstraction and Generalization in Reinforcement Learning: A Summary and Framework [chapter]

Marc Ponsen, Matthew E. Taylor, Karl Tuyls
2010 Lecture Notes in Computer Science  
In this paper we survey the basics of reinforcement learning, generalization and abstraction.  ...  We start with an introduction to the fundamentals of reinforcement learning and motivate the necessity for generalization and abstraction.  ...  Marc Ponsen is sponsored by the Interactive Collaborative Information Systems (ICIS) project, supported by the Dutch Ministry of Economic Affairs, grant nr: BSIK03024.  ... 
doi:10.1007/978-3-642-11814-2_1 fatcat:vovcchfhfrcpfat6evfefc54em

Predictive representations can link model-based reinforcement learning to model-free mechanisms

Evan M. Russek, Ida Momennejad, Matthew M. Botvinick, Samuel J. Gershman, Nathaniel D. Daw, Jean Daunizeau
2017 PLoS Computational Biology  
The foundation of this framework is the successor representation, a predictive state representation that, when combined with TD learning of value predictions, can produce a subset of the behaviors associated  ...  behavior are structurally homologous to and overlapping with those thought to carry out model-free temporal difference (TD) learning.  ...  Also, recent approaches to reinforcement learning in the brain have advocated for a hierarchical approach in which punctate actions are supplemented by temporally abstract policies [111] .  ... 
doi:10.1371/journal.pcbi.1005768 pmid:28945743 pmcid:PMC5628940 fatcat:p35rio5a5bbv7dzfbqyru5vdxe
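
The computational claim at the heart of this entry is the standard SR value factorization, under which learned predictive state occupancies combine linearly with a reward map:

    V^{\pi}(s) = \sum_{s'} M^{\pi}(s, s')\, R(s')

Because M and R are learned separately, a change in reward can update values without relearning the dynamics, which is the model-based-like flexibility the paper examines.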
Showing results 1 — 15 out of 6,742 results