Option Discovery in the Absence of Rewards with Manifold Analysis [article]

Amitay Bar, Ronen Talmon, Ron Meir
2020 arXiv   pre-print
Options have been shown to be an effective tool in reinforcement learning, facilitating improved exploration and learning.  ...  Incorporating modes associated with higher graph frequencies unravels domain subtleties, which are shown to be useful for option discovery.  ...  The work of RM is partially supported by the Ollendorff Center of the Viterbi Faculty of Electrical Engineering at the Technion, and by the Skillman chair in biomedical sciences.  ... 
arXiv:2003.05878v2 fatcat:g2sgedwvyrgkpgbefus7jtyjsm

Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration [article]

Tingguang Li, Jin Pan, Delong Zhu, Max Q.-H. Meng
2018 arXiv   pre-print
While deep reinforcement learning achieves significant success recently, it is still extremely difficult to be deployed in real robots directly.  ...  In this paper, we propose a hybrid structure named Option-Interruption in which human knowledge is embedded into a hierarchical reinforcement learning framework.  ...  Recently, Deep Reinforcement Learning (DRL) has achieved significant success in various games [1] [2] and shows a promising future.  ... 
arXiv:1807.11150v1 fatcat:yxo4wc4mtzcbxa6llscppzfo6m

Playing Atari Ball Games with Hierarchical Reinforcement Learning [article]

Hua Huang, Adrian Barbu
2019 arXiv   pre-print
In this way humans can learn much faster compared with most current artificial intelligence algorithms in many tasks.  ...  We argue that these instructions have tremendous value in designing a reinforcement learning system which can learn in human fashion, and we test the idea by playing the Atari games Tennis and Pong.  ...  The State-Option-Reward-State-Option(λ) algorithm Compared with Q-learning, which is more popular in Deep Reinforcement Learning, SARSA is an on-policy method that is stable when combined with bootstrap  ... 
arXiv:1909.12465v1 fatcat:lungjcxzfjhz3ell3gpuenk3sq

Deep Reinforcement Learning for Dexterous Manipulation with Concept Networks [article]

Aditya Gudimella, Ross Story, Matineh Shaker, Ruofan Kong, Matthew Brown, Victor Shnayder, Marcos Campos
2017 arXiv   pre-print
With this hierarchical learning approach, deep reinforcement learning can be used to solve complex tasks in a modular way, through problem decomposition.  ...  Deep reinforcement learning yields great results for a large array of problems, but models are generally retrained anew for each new problem to be solved.  ...  Kulkarni et al. [2016] propose a scheme for temporal abstraction that involves simultaneously learning options and a control policy to compose options in a deep reinforcement learning framework.  ... 
arXiv:1709.06977v1 fatcat:4pqzm3g6c5dbzgvolphhag4bty

Temporal Abstraction in Reinforcement Learning with the Successor Representation [article]

Marlos C. Machado, Andre Barreto, Doina Precup
2021 arXiv   pre-print
In reinforcement learning, this is often modeled through temporally extended courses of actions called options.  ...  Nevertheless, approaches based on the options framework often start with the assumption that a reasonable set of options is known beforehand.  ...  Kulkarni et al. (2016), in the context of deep reinforcement learning, also discovered bottleneck options through normalized cuts.  ... 
arXiv:2110.05740v1 fatcat:zrguhnljlvbyhlvqlir4cpfrx4

Discovering hierarchies using Imitation Learning from hierarchy aware policies [article]

Ameet Deshpande, Harshavardhan Kamarthi, Balaraman Ravindran
2019 arXiv   pre-print
Deep Discovery of Options (DDO) is a generative algorithm that learns a hierarchical policy along with options directly from expert trajectories.  ...  Learning options that allow agents to exhibit temporally higher order behavior has proven to be useful in increasing exploration, reducing sample complexity and for various transfer scenarios.  ...  Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013. Negin Nejati, Pat Langley, and Tolga Konik. Learning hierarchical task networks by observation.  ... 
arXiv:1812.00225v2 fatcat:ppdehoiivre5zd7sxy77gjs56q

Efficient Deep Reinforcement Learning via Adaptive Policy Transfer [article]

Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, Jiajie Peng
2020 arXiv   pre-print
PTF can be easily combined with existing deep RL approaches.  ...  Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks.  ...  Introduction Recent advance in Deep Reinforcement Learning (DRL) has obtained expressive success of achieving human-level control in complex tasks [Mnih et al., 2015; Lillicrap et al., 2016].  ... 
arXiv:2002.08037v3 fatcat:morngkz4hbgizowoqrsosw2yli

Model Learning for Look-ahead Exploration in Continuous Control [article]

Arpit Agarwal, Katharina Muelling, Katerina Fragkiadaki
2018 arXiv   pre-print
Our skills are multi-goal policies learned in isolation in simpler environments using existing multigoal RL formulations, analogous to options or macroactions.  ...  We propose an exploration method that incorporates look-ahead search over basic learnt skills and their dynamics, and use it for reinforcement learning (RL) of manipulation policies.  ...  We use learned skill dynamics with deep neural regressors and use them for look-ahead tree search, to guide effective exploration in reinforcement learning of complex manipulation tasks.  ... 
arXiv:1811.08086v1 fatcat:trbxm7n2tvbatksjjmh56bbm7i

Strategic Attentive Writer for Learning Macro-Actions [article]

Alexander Vezhnevets, Volodymyr Mnih, John Agapiou, Simon Osindero, Alex Graves, Oriol Vinyals, Koray Kavukcuoglu
2016 arXiv   pre-print
We present a novel deep recurrent neural network architecture that learns to build implicit plans in an end-to-end manner by purely interacting with an environment in reinforcement learning setting.  ...  These macro-actions enable both structured exploration and economic computation.  ...  In this game, agent explores a maze with a yellow avatar. It has to cover all of the maze territory without colliding with green figures (enemies).  ... 
arXiv:1606.04695v1 fatcat:476h3pvfyjh7fbz6bos7gu4sfa

Model Learning for Look-Ahead Exploration in Continuous Control

Arpit Agarwal, Katharina Muelling, Katerina Fragkiadaki
2019 Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference
Our skills are multi-goal policies learned in isolation in simpler environments using existing multigoal RL formulations, analogous to options or macroactions.  ...  We propose an exploration method that incorporates lookahead search over basic learnt skills and their dynamics, and use it for reinforcement learning (RL) of manipulation policies.  ...  We use learned skill dynamics with deep neural regressors and use them for look-ahead tree search, to guide effective exploration in reinforcement learning of complex manipulation tasks. path that leads  ... 
doi:10.1609/aaai.v33i01.33013151 fatcat:ddmhmwojjnf33glnplzpc5sbpu

Applied Machine Learning for Games: A Graduate School Course [article]

Yilei Zeng, Aayush Shah, Jameson Thai, Michael Zyda
2021 arXiv   pre-print
In this paper, we describe our machine learning course designed for graduate students interested in applying recent advances of deep learning and reinforcement learning towards gaming.  ...  Student projects cover use-cases such as training AI-bots in gaming benchmark environments and competitions, understanding human decision patterns in gaming, and creating intelligent non-playable characters  ...  Leveraging the information provided in the YouTube videos, researchers can guide deep reinforcement learning explorations for games with sparse rewards (Aytar et al. 2018).  ... 
arXiv:2012.01148v2 fatcat:f44ln32jnbfhrearv234ylteru

Policy search in continuous action domains: An overview

Olivier Sigaud, Freek Stulp
2019 Neural Networks  
Continuous action policy search is currently the focus of intensive research, driven both by the recent success of deep reinforcement learning algorithms and the emergence of competitors based on evolutionary  ...  In this paper, we present a broad survey of policy search methods, providing a unified perspective on very different approaches, including also Bayesian Optimization and directed exploration methods.  ...  Furthermore, recent attempts to combine BO with reinforcement learning approaches, giving rise to the Bayesian Optimization Reinforcement Learning (BORL) framework, are described in Section 5.  ... 
doi:10.1016/j.neunet.2019.01.011 fatcat:itzh3ogmgfdahomr3gjlduvkbq

Policy Search in Continuous Action Domains: an Overview [article]

Olivier Sigaud, Freek Stulp
2019 arXiv   pre-print
Continuous action policy search is currently the focus of intensive research, driven both by the recent success of deep reinforcement learning algorithms and the emergence of competitors based on evolutionary  ...  In this paper, we present a broad survey of policy search methods, providing a unified perspective on very different approaches, including also Bayesian Optimization and directed exploration methods.  ...  Furthermore, recent attempts to combine BO with reinforcement learning approaches, giving rise to the Bayesian Optimization Reinforcement Learning (BORL) framework, are described in Section 5.  ... 
arXiv:1803.04706v5 fatcat:llh4j5js5reopegwduelivxxm4

Exploring applications of deep reinforcement learning for real-world autonomous driving systems [article]

Victor Talpaert, Ibrahim Sobh, B Ravi Kiran, Patrick Mannion, Senthil Yogamani, Ahmad El-Sallab, Patrick Perez
2019 arXiv   pre-print
Deep Reinforcement Learning (DRL) has become increasingly powerful in recent years, with notable achievements such as Deepmind's AlphaGo.  ...  We first provide an overview of the tasks in autonomous driving systems, reinforcement learning algorithms and applications of DRL to AD systems.  ...  Exploration Issues with Imitation In some cases, demonstrations from experts are not available or even not covering the state space leading to learning a poor policy.  ... 
arXiv:1901.01536v3 fatcat:y3gck5rznjglvim4gem5dvb2ue

Deep Reinforcement Learning [article]

Yuxi Li
2018 arXiv   pre-print
We discuss deep reinforcement learning in an overview style. We draw a big picture, filled with details.  ...  We start with background of artificial intelligence, machine learning, deep learning, and reinforcement learning (RL), with resources.  ...  Lanctot et al. (2017) observe that independent RL, in which each agent learns by interacting with the environment, oblivious to other agents, can overfit the learned policies to other agents' policies  ... 
arXiv:1810.06339v1 fatcat:kp7atz5pdbeqta352e6b3nmuhy