43 Hits in 10.2 sec

Hierarchical Skills for Efficient Exploration [article]

Jonas Gehring, Gabriel Synnaeve, Andreas Krause, Nicolas Usunier
2021 arXiv   pre-print
We alleviate the need for prior knowledge by proposing a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.  ...  In this work, we analyze this trade-off for low-level policy pre-training with a new benchmark suite of diverse, sparse-reward tasks for bipedal robots.  ...  Acknowledgements: We thank Alessandro Lazaric for insightful discussions, and Franziska Meier, Ludovic Denoyer, and Kevin Lu for helpful feedback on early versions of this paper.  ... 
arXiv:2110.10809v1 fatcat:26d3laqcrzc45k6omuoskbzife

Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation [article]

Tianhong Dai, Kai Arulkumaran, Tamara Gerbert, Samyakh Tukra, Feryal Behbahani, Anil Anthony Bharath
2020 arXiv   pre-print
are heavily influenced by the task setup and presence of additional proprioceptive inputs.  ...  Finally, we investigate the internals of the trained agents by using a suite of interpretability techniques.  ...  Proximal Policy Optimisation: For our experiments we train our agents using proximal policy optimisation (PPO) (Schulman et al., 2017), a widely used and performant RL algorithm.[1] Rather than training  ... 
arXiv:1912.08324v2 fatcat:zosixgcqgrajtf36bhx7fydhem

Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation [article]

Charles Sun, Jędrzej Orbik, Coline Devin, Brian Yang, Abhishek Gupta, Glen Berseth, Sergey Levine
2021 arXiv   pre-print
Our method employs a modularized policy with components for manipulation and navigation, where manipulation policy uncertainty drives exploration for the navigation controller, and the manipulation module  ...  We study how robots can autonomously learn skills that require a combination of navigation and grasping.  ...  We obtain the former by training an ensemble of grasping policies and using their uncertainty to efficiently explore grasping.  ... 
arXiv:2107.13545v3 fatcat:jwenctwrb5fhllvwreqbalhf7a

RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [article]

Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li (+6 others)
2021 arXiv   pre-print
We propose detailed evaluation protocols for each domain in RL Unplugged and provide an extensive analysis of supervised learning and offline RL methods using these protocols.  ...  We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more  ...  As the use of a different offline RL algorithm would likely result in different ordering, our objective was to only approximately cover games of varying difficulty in our offline policy selection tasks  ... 
arXiv:2006.13888v4 fatcat:whpwfgfbkveyfplq2smz5ovmo4

Reinforcement Learning in Practice: Opportunities and Challenges [article]

Yuxi Li
2022 arXiv   pre-print
Then we discuss opportunities of RL, in particular, products and services, games, bandits, recommender systems, robotics, transportation, finance and economics, healthcare, education, combinatorial optimization  ...  We conclude with a discussion, attempting to answer: "Why has RL not been widely adopted in practice yet?" and "When is RL helpful?".  ...  Hierarchical RL is an approach for issues of sparse rewards and/or long horizons, with exploration in the space of high-level goals.  ... 
arXiv:2202.11296v2 fatcat:xdtsmme22rfpfn6rgfotcspnhy

Learning, memory and consolidation mechanisms for behavioral control in hierarchically organized cortico‐basal ganglia systems

Silviu I. Rusu, Cyriel M. A. Pennartz
2019 Hippocampus  
sequences of planned, goal-directed behavior.  ...  The functioning of a connected set of brain structures-prefrontal cortex, hippocampus, striatum, and dopaminergic mesencephalon-is reviewed in relation to two important distinctions: (a) goal-directed  ...  Here, we emphasize that hierarchical behavioral control includes more than the (hierarchical organization of) classic RL alone, tied as this is to MFL.  ... 
doi:10.1002/hipo.23167 pmid:31617622 fatcat:elty4qd4enhz3kttul5scagyom

A Metaverse: taxonomy, components, applications, and open challenges

Sang-Min Park, Young-Gab Kim
2022 IEEE Access  
access to connectivity with reality using virtual currency.  ...  The integration of enhanced social activities and neural-net methods requires a new definition of Metaverse suitable for the present, different from the previous Metaverse.  ...  It includes an ensemble of multiple networks in a hierarchical tree structure that shares an intermediate layer.  ... 
doi:10.1109/access.2021.3140175 fatcat:fnraeaz74vh33knfvhzrynesli

Integrating reinforcement learning, equilibrium points, and minimum variance to understand the development of reaching: A computational model

Daniele Caligiore, Domenico Parisi, Gianluca Baldassarre
2014 Psychological review  
of the three mentioned hypotheses: the model first quickly learns to perform coarse movements that assure a contact of the hand with the target (an achievement with great adaptive value), and then slowly  ...  speed profile, and the evolution of the management of redundant degrees of freedom.  ...  The first outcome of the simulation is that the model brings the reward from 0.75 (average on the last 1000 trials  ... 
doi:10.1037/a0037016 pmid:25090425 fatcat:a4xsbizhufaive6qpjb6pyeryi

ToyArchitecture: Unsupervised learning of interpretable models of the environment

Jaroslav Vítků, Petr Dluhoš, Joseph Davidson, Matěj Nikl, Simon Andersson, Přemysl Paška, Jan Šinkora, Petr Hlubuček, Martin Stránský, Martin Hyben, Martin Poliak, Jan Feyereisl (+2 others)
2020 PLoS ONE  
This architecture incorporates the unsupervised learning of a model of the environment, learning the influence of one's own actions, model-based reinforcement learning, hierarchical planning, and symbolic  ...  The learned model is stored in the form of hierarchical representations which are increasingly more abstract, but can retain details when needed.  ...  Based on predictive modelling, it tries to learn how to store and retrieve representations in an unsupervised manner, which are then used in RL tasks.  ... 
doi:10.1371/journal.pone.0230432 pmid:32421693 fatcat:kl4kbezycrayfbgqp3pgt5xcau

Eye-movements as a signature of age-related differences in global planning strategies for spatial navigation [article]

Elisa M. Tartaglia, Celine Boucly, Guillaume Tatur, Angelo Arleo
2018 biorxiv/medrxiv   pre-print
We used reinforcement learning (RL) to corroborate that eye-movement statistics were crucial in subtending the decision-making process involved in re-planning, and that the incorporation of this additional  ...  Although a plethora of plausible computations have been put forward to elucidate how the brain accomplishes efficient goal-oriented navigation, the mechanisms that guide an effective re-planning when facing  ...  Moreover, the use of RL to fit behavioural data of (young) observers performing a navigation task has provided a clear indication of the strategies in use.  ... 
doi:10.1101/481788 fatcat:kff5msdv3zfbxn4j7xt52iw5ce

ToyArchitecture: Unsupervised Learning of Interpretable Models of the World [article]

Jaroslav Vítků, Petr Dluhoš, Joseph Davidson, Matěj Nikl, Simon Andersson, Přemysl Paška, Jan Šinkora, Petr Hlubuček, Martin Stránský, Martin Hyben, Martin Poliak, Jan Feyereisl (+1 other)
2019 arXiv   pre-print
This enables the architecture to learn efficiently using sub-symbolic methods and to employ symbolic inference.  ...  In this paper, we present a novel, purposely simple, and interpretable hierarchical architecture which combines multiple different mechanisms into one system: unsupervised learning of a model of the world  ...  Based on predictive modelling, it tries to learn how to store and retrieve representations in an unsupervised manner, which are then used in RL tasks.  ... 
arXiv:1903.08772v2 fatcat:wnknrw73pfhnpi6zy35pecriom

Towards an integration of deep learning and neuroscience [article]

Adam Marblestone, Greg Wayne, Konrad Kording
2016 arXiv   pre-print
In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively  ...  First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short- and long-term memory storage.  ...  We thank Miles Brundage for an excellent Twitter feed of deep learning papers.  ... 
arXiv:1606.03813v1 fatcat:tmmholydqbcplbc5ihg76yip6e

Toward an Integration of Deep Learning and Neuroscience

Adam H. Marblestone, Greg Wayne, Konrad P. Kording
2016 Frontiers in Computational Neuroscience  
First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short- and long-term memory storage.  ...  In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively  ...  Hierarchical RL relies on a hierarchical representation of state and action spaces, and it has been suggested that error-driven learning of an optimal such representation in the hippocampus [50] gives rise  ... 
doi:10.3389/fncom.2016.00094 pmid:27683554 pmcid:PMC5021692 fatcat:yikwc4h5yvfj7gwzlimtw5n6ai

Towards an integration of deep learning and neuroscience [article]

Adam Henry Marblestone, Greg Wayne, Konrad P Kording
2016 bioRxiv   pre-print
In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively  ...  First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short- and long-term memory storage.  ...  Hierarchical RL relies on a hierarchical representation of state and action spaces, and it has been suggested that error-driven learning of an optimal such representation in the hippocampus [50] gives rise  ... 
doi:10.1101/058545 fatcat:4ryejpe2tnf7dgoaqhoastoiya

28th Annual Computational Neuroscience Meeting: CNS*2019

2019 BMC Neuroscience  
We have recently revealed the presence of dynamical invariants in the pyloric CPG in the form of cycle-by-cycle linear relations among specific time intervals and the instantaneous period [4].  ...  Some efforts use neurons with intrinsically rich dynamics, as observed in several experimental works on CPGs [2, 3].  ...  Inspired by the FAIR data policy [1], we propose a hierarchical data organization with copious amounts of metadata, to help keep everything organized, easily shareable and traceable.  ... 
doi:10.1186/s12868-019-0538-0 fatcat:3pt5qvsh45awzbpwhqwbzrg4su
Showing results 1 — 15 out of 43 results