189,918 Hits in 5.0 sec

Deep Q-learning From Demonstrations

Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Ian Osband, Gabriel Dulac-Arnold, John Agapiou (+2 others)
2018 Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and the Thirtieth Innovative Applications of Artificial Intelligence Conference (AAAI-18/IAAI-18)  
We present an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages small sets of demonstration data to massively accelerate the learning process even from relatively small amounts of demonstration  ...  Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-making problems.  ...  We propose a new deep reinforcement learning algorithm, Deep Q-learning from Demonstrations (DQfD), which leverages even very small amounts of demonstration data to massively accelerate learning.  ... 
doi:10.1609/aaai.v32i1.11757 fatcat:2czgsd626ne4bhsouack3dn6fm
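
For orientation: DQfD trains a Q-network on a mix of demonstration and self-generated transitions with a combined objective, the usual TD loss plus a large-margin supervised loss that keeps the demonstrated action's value above all others. A minimal sketch of that combination in Python follows; the network interface, margin value, and loss weight are illustrative assumptions, and DQfD's n-step TD term, prioritized replay, and L2 regularization are omitted.

# Minimal sketch of a DQfD-style combined loss (TD + large-margin supervised term).
# Network size, margin, and loss weights are illustrative assumptions.
import torch
import torch.nn as nn

def dqfd_loss(q_net, target_net, batch, gamma=0.99, margin=0.8, lambda_e=1.0):
    # s, s_next: state tensors; a: long action indices; r, done, is_demo: float tensors,
    # with is_demo = 1.0 marking demonstration transitions.
    s, a, r, s_next, done, is_demo = batch

    q = q_net(s)                                   # [B, num_actions]
    q_sa = q.gather(1, a.unsqueeze(1)).squeeze(1)  # Q(s, a)

    # 1-step TD target from a separate target network.
    with torch.no_grad():
        q_next = target_net(s_next).max(dim=1).values
        td_target = r + gamma * (1.0 - done) * q_next
    td_loss = nn.functional.smooth_l1_loss(q_sa, td_target)

    # Large-margin supervised loss, applied only to demonstration transitions:
    # max_a [Q(s,a) + l(a_E, a)] - Q(s, a_E), with l = margin for a != a_E, 0 otherwise.
    margins = torch.full_like(q, margin)
    margins.scatter_(1, a.unsqueeze(1), 0.0)
    supervised = (q + margins).max(dim=1).values - q_sa
    supervised_loss = (supervised * is_demo).sum() / is_demo.sum().clamp(min=1)

    return td_loss + lambda_e * supervised_loss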

Deep Q-learning from Demonstrations [article]

Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou (+2 others)
2017 arXiv   pre-print
We present an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages small sets of demonstration data to massively accelerate the learning process even from relatively small amounts of demonstration  ...  Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-making problems.  ...  We propose a new deep reinforcement learning algorithm, Deep Q-learning from Demonstrations (DQfD), which leverages even very small amounts of demonstration data to massively accelerate learning.  ... 
arXiv:1704.03732v4 fatcat:aojkn6wbozc6xlcfdfylqsfr6y

Deep reward shaping from demonstrations

Ahmed Hussein, Eyad Elyan, Mohamed Medhat Gaber, Chrisina Jayne
2017 International Joint Conference on Neural Networks (IJCNN)  
METHOD This section presents the proposed method for deep reward shaping from demonstrations. The method uses a deep supervised network to learn a shaping function from demonstration.  ...  With the rising interest in deep reinforcement learning, some efforts explore incorporating learning from demonstrations with RL in a deep learning context.  ... 
doi:10.1109/ijcnn.2017.7965896 dblp:conf/ijcnn/HusseinEGJ17 fatcat:65fljt6wfbbzjjj5smmm2vzj5y
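
The snippet above describes a supervised network that learns a shaping function from demonstrations. A standard way to apply such a network without changing the optimal policy is potential-based shaping, F(s, s') = γΦ(s') − Φ(s) (Ng et al., 1999), with Φ given by the demonstration-trained model. The sketch below illustrates that pattern under an assumed architecture and training setup; it is not the paper's exact formulation.

# Minimal sketch: potential-based reward shaping with a demonstration-trained potential.
# The potential network and its training are assumptions; only the shaping formula
# F(s, s') = gamma * phi(s') - phi(s) is standard (Ng et al., 1999).
import torch
import torch.nn as nn

class PotentialNet(nn.Module):
    """Maps a state vector to a scalar potential, fit to demonstration data."""
    def __init__(self, state_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, s):
        return self.net(s).squeeze(-1)

def shaped_reward(phi, s, s_next, env_reward, gamma=0.99):
    """Augment the environment reward with a potential-based shaping term."""
    with torch.no_grad():
        shaping = gamma * phi(s_next) - phi(s)
    return env_reward + shaping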

Pre-training Neural Networks with Human Demonstrations for Deep Reinforcement Learning [article]

Gabriel V. de la Cruz Jr, Yunshu Du, Matthew E. Taylor
2019 arXiv   pre-print
Deep reinforcement learning (deep RL) has achieved superior performance in complex sequential tasks by using a deep neural network as its function approximator and by learning directly from raw images.  ...  A drawback of using raw images is that deep RL must learn the state feature representation from the raw images in addition to learning a policy.  ...  Deep Q-network The first successful deep RL method, deep Q-network (DQN), learns to play 49 Atari games directly from screen pixels by combining Q-learning with a deep convolutional neural network [16  ... 
arXiv:1709.04083v2 fatcat:jm75zoaffncrzejzpp247fno5u
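
The entry above pre-trains the feature-learning layers of a deep RL network on human demonstrations before RL starts. A common realization is to fit the convolutional encoder with a supervised action-classification loss on demonstration (state, action) pairs and then reuse it to initialize the DQN; the sketch below shows that pattern with assumed input shapes and hyperparameters, not the paper's exact setup.

# Minimal sketch: supervised pre-training of a DQN's convolutional encoder on
# demonstration (state, action) pairs. Shapes and hyperparameters are assumptions.
import torch
import torch.nn as nn

class AtariQNet(nn.Module):
    def __init__(self, num_actions, in_channels=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(), nn.Flatten(),
        )
        self.head = nn.Sequential(nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
                                  nn.Linear(512, num_actions))

    def forward(self, x):
        return self.head(self.encoder(x))

def pretrain_on_demos(q_net, demo_loader, epochs=5, lr=1e-4):
    """Treat demonstration actions as labels and minimize cross-entropy."""
    opt = torch.optim.Adam(q_net.parameters(), lr=lr)
    for _ in range(epochs):
        for states, actions in demo_loader:   # states: [B, 4, 84, 84], actions: [B]
            loss = nn.functional.cross_entropy(q_net(states), actions)
            opt.zero_grad()
            loss.backward()
            opt.step()
    # The pre-trained encoder (and optionally the head) then initializes the DQN used for RL.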

Hierarchical Deep Q-Network from Imperfect Demonstrations in Minecraft [article]

Alexey Skrynnik, Aleksey Staroverov, Ermek Aitygulov, Kirill Aksenov, Vasilii Davydov, Aleksandr I. Panov
2020 arXiv   pre-print
We present Hierarchical Deep Q-Network (HDQfD), which took first place in the MineRL competition. HDQfD works on imperfect demonstrations and utilizes the hierarchical structure of expert trajectories.  ...  We introduce the procedure of extracting an effective sequence of meta-actions and subgoals from demonstration data.  ...  This hierarchical Deep Q-Network from Demonstrations won first place in the MineRL competition and received a score of 61.61.  ... 
arXiv:1912.08664v4 fatcat:c7t67u2vxzgmjcqxmtwds4ve3i
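
The abstract above mentions extracting a sequence of subgoals and meta-actions from imperfect demonstrations, but the snippet does not spell out the procedure. One plausible reading for MineRL-style data is to segment each trajectory at the timesteps where an inventory item count first increases; the sketch below is a guess at that kind of preprocessing step, not the authors' actual method.

# Hypothetical sketch: split a MineRL-style demonstration into subgoal events at the
# timesteps where an item count first increases. This is a guess at the kind of
# preprocessing the paper describes, not its actual procedure.
def extract_subgoals(trajectory):
    """trajectory: list of dicts with an 'inventory' mapping item name -> count."""
    subgoals, prev = [], {}
    for t, step in enumerate(trajectory):
        for item, count in step["inventory"].items():
            if count > prev.get(item, 0):
                subgoals.append((t, item))   # the agent just obtained `item` at time t
        prev = dict(step["inventory"])
    return subgoals

# Example: a toy trajectory where a log, then planks, are obtained.
demo = [{"inventory": {}}, {"inventory": {"log": 1}}, {"inventory": {"log": 1, "planks": 4}}]
print(extract_subgoals(demo))  # [(1, 'log'), (2, 'planks')]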

Active Deep Q-learning with Demonstration [article]

Si-An Chen, Voot Tangkaratt, Hsuan-Tien Lin, Masashi Sugiyama
2018 arXiv   pre-print
Recent research has shown that although Reinforcement Learning (RL) can benefit from expert demonstration, it usually takes considerable effort to obtain enough demonstrations.  ...  Under the framework, we propose Active Deep Q-Network, a novel query strategy which adapts to the dynamically-changing distributions during the RL training process by estimating the uncertainty of recent  ...  Deep Q-learning from Demonstration Deep Q-learning from Demonstration (DQfD; Hester et al., 2018) is a state-of-the-art method to leverage demonstration data to accelerate the learning process of DQN  ... 
arXiv:1812.02632v1 fatcat:b6qnwltqtjbtpc7whmq72wvboe
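
The truncated snippet above refers to a query strategy driven by the uncertainty of recent estimates. A generic way to realize uncertainty-driven querying is to keep an ensemble of Q-heads and ask the demonstrator for an action whenever the heads disagree; the sketch below illustrates that idea and is not the paper's exact criterion.

# Illustrative sketch (not the paper's exact rule): query the demonstrator when an
# ensemble of Q-heads disagrees strongly about the greedy action's value.
import torch
import torch.nn as nn

class EnsembleQ(nn.Module):
    def __init__(self, state_dim, num_actions, n_heads=5):
        super().__init__()
        self.heads = nn.ModuleList([
            nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, num_actions))
            for _ in range(n_heads)
        ])

    def forward(self, s):
        return torch.stack([h(s) for h in self.heads])   # [n_heads, B, num_actions]

def should_query(ensemble, state, threshold=0.5):
    """Ask for a demonstration when the heads' greedy values have high variance."""
    with torch.no_grad():
        q = ensemble(state.unsqueeze(0))              # [n_heads, 1, num_actions]
        greedy_values = q.max(dim=-1).values          # [n_heads, 1]
        return greedy_values.std(dim=0).item() > threshold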

Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates [article]

Shixiang Gu and Ethan Holly and Timothy Lillicrap and Sergey Levine
2016 arXiv   pre-print
In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network  ...  Deep reinforcement learning alleviates this limitation by training general-purpose neural network policies, but applications of direct deep reinforcement learning algorithms have so far been restricted  ...  We demonstrate that, contrary to commonly held assumptions, recently developed off-policy deep Q-function based algorithms such as the Deep Deterministic Policy Gradient  ... 
arXiv:1610.00633v2 fatcat:i3nllxmobvhy5dmnqdzmhtelqa
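
The entry above scales off-policy deep Q-function training by running several robots asynchronously against shared experience. The sketch below shows that producer/consumer structure with Python threads; the environments, policy, and update step are placeholders rather than the paper's specific off-policy algorithm.

# Structural sketch of asynchronous off-policy training: several collector threads push
# transitions into a shared replay buffer while one trainer thread samples and updates.
# The environments, policy, and update step are placeholders.
import random
import threading
from collections import deque

replay = deque(maxlen=100_000)
lock = threading.Lock()

def collector(env, policy, steps=10_000):
    s = env.reset()
    for _ in range(steps):
        a = policy(s)
        s_next, r, done, _ = env.step(a)
        with lock:
            replay.append((s, a, r, s_next, done))
        s = env.reset() if done else s_next

def trainer(update_fn, batch_size=64, updates=50_000):
    for _ in range(updates):
        with lock:
            if len(replay) < batch_size:
                continue
            batch = random.sample(replay, batch_size)
        update_fn(batch)   # e.g., one gradient step on a deep Q-function

# threads = [threading.Thread(target=collector, args=(env_i, policy)) for env_i in envs]
# threads.append(threading.Thread(target=trainer, args=(q_update,)))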

Deep reinforcement learning for optical systems: A case study of mode-locked lasers [article]

Chang Sun, Eurika Kaiser, Steven L. Brunton, J. Nathan Kutz
2020 arXiv   pre-print
We demonstrate that deep reinforcement learning (deep RL) provides a highly effective strategy for the control and self-tuning of optical systems.  ...  Deep RL integrates the two leading machine learning architectures of deep neural networks and reinforcement learning to produce robust and stable learning for control.  ...  RESULTS We demonstrate the efficacy of deep reinforcement learning control on mode-locked fiber lasers in Fig. 1. We first demonstrate the deep RL strategy for a single-input control (α₁).  ... 
arXiv:2006.05579v1 fatcat:xqxghch6rre55dkgnf6wfjicjq

Deep imitation learning for 3D navigation tasks

Ahmed Hussein, Eyad Elyan, Mohamed Medhat Gaber, Chrisina Jayne
2017 Neural computing & applications (Print)  
In this paper, we propose a deep imitation learning method to learn navigation tasks from demonstrations in a 3D environment.  ...  This approach is compared to two popular deep reinforcement learning techniques: deep Q-networks (DQN) and asynchronous advantage actor-critic (A3C).  ...  The proposed learning-from-demonstration method is compared to two popular deep reinforcement learning methods: deep Q-networks (DQN), which have shown human-level performance learning Atari games from raw  ... 
doi:10.1007/s00521-017-3241-z pmid:29576690 pmcid:PMC5857289 fatcat:k4falirfe5dn7jmzet74vmmbpa

Deep Inverse Q-learning with Constraints [article]

Gabriel Kalweit, Maria Huegle, Moritz Werling, Joschka Boedecker
2020 arXiv   pre-print
We evaluate the resulting algorithms called Inverse Action-value Iteration, Inverse Q-learning and Deep Inverse Q-learning on the Objectworld benchmark, showing a speedup of up to several orders of magnitude  ...  This is possible through a formulation that exploits a probabilistic behavior assumption for the demonstrations within the structure of Q-learning.  ...  High-level Decision Making for Autonomous Driving We apply Deep Constrained Inverse Q-Learning (DCIQL) to learn autonomous lane-changes on highways from demonstrations.  ... 
arXiv:2008.01712v1 fatcat:jta6rr7a5ramrdosqtvszfa5du
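
The snippet above notes that the approach exploits a probabilistic behavior assumption within the structure of Q-learning. Under a Boltzmann assumption, π_E(a|s) ∝ exp(Q(s,a)), differences of log action probabilities equal differences of action-values, which is the kind of relation such methods build on. The toy sketch below only illustrates that relation from empirical action counts; it is not the Inverse Action-value Iteration or (Deep) Inverse Q-learning algorithm itself.

# Illustrative sketch of the Boltzmann behavior assumption used by inverse Q-learning
# style methods: if demonstrations follow pi_E(a|s) proportional to exp(Q(s,a)), then
# differences of log action frequencies recover differences of action-values.
# The tabular setup and counts are toy assumptions, not the paper's algorithm.
import numpy as np

def q_differences_from_demos(action_counts):
    """action_counts: [num_states, num_actions] visit counts from demonstrations."""
    probs = (action_counts + 1e-8) / (action_counts.sum(axis=1, keepdims=True) + 1e-8)
    log_p = np.log(probs)
    # Q(s, a) - Q(s, b) = log pi(a|s) - log pi(b|s); report values relative to action 0.
    return log_p - log_p[:, [0]]

counts = np.array([[30, 10, 5], [2, 40, 8]], dtype=float)
print(q_differences_from_demos(counts))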

Performance Enhancement of Deep Reinforcement Learning Networks Using Feature Extraction [chapter]

Joaquin Ollero, Christopher Child
2018 Lecture Notes in Computer Science  
The results show that the extraction of features from the hidden layers of the Deep Q-Network improves the learning process of the agent (4.58 times faster and better) and proves the existence of encoded  ...  The combination of Deep Learning and Reinforcement Learning, termed Deep Reinforcement Learning Networks (DRLN), offers the possibility of using a Deep Learning Neural Network to produce an approximate  ...  Lastly, by using features extracted from the last hidden layer of a specific Deep Q-Network, we demonstrate that these can be used to predict the best actions to take from each state to reach the goal  ... 
doi:10.1007/978-3-319-92537-0_25 fatcat:ha4jjvn3ljg2few2rwkb3ldvci
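
The entry above uses features extracted from the hidden layers of a Deep Q-Network. One standard way to pull those activations out of a trained PyTorch model is a forward hook; the sketch below shows that mechanism on an assumed small network, with the choice of layer as an assumption.

# Minimal sketch: extracting activations of a Q-network's last hidden layer with a
# forward hook. The network definition and chosen layer are assumptions.
import torch
import torch.nn as nn

q_net = nn.Sequential(
    nn.Linear(16, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),   # last hidden layer whose features we extract
    nn.Linear(64, 4),                # Q-values for 4 actions
)

features = {}
def save_features(module, inputs, output):
    features["last_hidden"] = output.detach()

q_net[3].register_forward_hook(save_features)   # hook on the ReLU after the 64-unit layer

state = torch.randn(1, 16)
q_values = q_net(state)
print(features["last_hidden"].shape)   # torch.Size([1, 64])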

Active deep Q-learning with demonstration

Si-An Chen, Voot Tangkaratt, Hsuan-Tien Lin, Masashi Sugiyama
2019 Machine Learning  
Under the framework, we propose Active deep Q-Network, a novel query strategy based on a classical RL algorithm called deep Q-network (DQN).  ...  In this work, we propose Active Reinforcement Learning with Demonstration, a new framework to streamline RL in terms of demonstration efforts by allowing the RL agent to query for demonstration actively  ...  Deep Q-learning from demonstration Deep Q-learning from Demonstration (DQfD; Hester et al. 2018 ) is a state-of-the-art method to leverage demonstration data to accelerate the learning process of DQN.  ... 
doi:10.1007/s10994-019-05849-4 fatcat:izql4tk7zfgedf2v3u5ajgx6s4

Deep Movement Primitives: toward Breast Cancer Examination Robot [article]

Oluwatoyin Sanni, Giorgio Bonvicini, Muhammad Arshad Khan, Pablo C. Lopez-Custodio, Kiyanoush Nazari, Amir M. Ghalamzan E.
2022 arXiv   pre-print
Robot learning from demonstrations (LfD) reduces the programming time and cost.  ...  This paper presents a novel approach to manipulation path/trajectory planning called deep Movement Primitives that successfully generates the movements of a manipulator to reach a breast phantom and perform  ...  We propose a novel Learning from Demonstration (LfD) approach called deep movement primitives (deep-MP) directly mapping the visual sensory information into the learned trajectory.  ... 
arXiv:2202.09265v1 fatcat:kft2cf4225gjzbporjy6patntq
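
The entry above maps visual input directly to a manipulator trajectory. In that spirit, the sketch below regresses a fixed number of end-effector waypoints from an image with a small CNN and fits it to demonstrated trajectories with an MSE loss; the architecture and trajectory parameterization are assumptions, not the paper's deep-MP model.

# Illustrative sketch of mapping an image to a manipulator trajectory: a CNN regresses a
# fixed number of end-effector waypoints and is fit to demonstrated trajectories with an
# MSE loss. Architecture and trajectory parameterization are assumptions.
import torch
import torch.nn as nn

class ImageToTrajectory(nn.Module):
    def __init__(self, n_waypoints=20, dof=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, n_waypoints * dof)
        self.n_waypoints, self.dof = n_waypoints, dof

    def forward(self, image):
        out = self.head(self.encoder(image))
        return out.view(-1, self.n_waypoints, self.dof)   # [B, waypoints, xyz]

model = ImageToTrajectory()
demo_images = torch.randn(8, 3, 128, 128)   # batch of camera images
demo_trajs = torch.randn(8, 20, 3)          # demonstrated waypoint trajectories
loss = nn.functional.mse_loss(model(demo_images), demo_trajs)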

Hierarchically Robust Representation Learning

Qi Qian, Juhua Hu, Hao Li
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
With the tremendous success of deep learning in visual tasks, the representations extracted from intermediate layers of learned models, that is, deep features, attract much attention from researchers.  ...  In this work, we investigate this phenomenon and demonstrate that deep features can be suboptimal due to the fact that they are learned by minimizing the empirical risk.  ...  It demonstrates that the deep features learned with the proposed algorithm are more robust than those from ERM when the distribution of concepts varies.  ... 
doi:10.1109/cvpr42600.2020.00736 dblp:conf/cvpr/QianHL20 fatcat:fzy7yohty5h2jklikatobxf3g4

Active Task-Inference-Guided Deep Inverse Reinforcement Learning [article]

Farzan Memarian, Zhe Xu, Bo Wu, Min Wen, Ufuk Topcu
2020 arXiv   pre-print
The module then proceeds to learn a reward function over the augmented state space using a novel deep maximum entropy IRL algorithm.  ...  Given a Markov decision process (MDP) and a set of demonstrations for a task, IRL learns a reward function that assigns a real-valued reward to each state of the MDP.  ...  Although previous deep MaxEnt IRL methods [4] , [34] - [36] can learn deep reward networks from demonstrations, they assume the reward function is a function of the current MDP state.  ... 
arXiv:2001.09227v3 fatcat:qkall4h7d5ej7mbig5itawlitu
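
The entry above learns a reward over an augmented state space with a deep maximum-entropy IRL algorithm. The generic MaxEnt IRL update moves the reward parameters along the gap between expert state-visitation frequencies and those induced by the current reward; the sketch below shows that step for a linear-in-features reward on a tabular state space, and leaves the computation of the soft-optimal policy's visitation frequencies (via soft value iteration and a forward pass) as an assumption. It is a generic MaxEnt IRL step, not the paper's algorithm.

# Generic sketch of a maximum-entropy IRL gradient step (not the paper's algorithm):
# the reward parameters move toward the expert's state-visitation frequencies and away
# from those induced by the current reward.
import numpy as np

def maxent_irl_step(theta, features, expert_svf, policy_svf, lr=0.1):
    """
    theta:       [d]            reward weights, reward(s) = features[s] @ theta
    features:    [n_states, d]  state features
    expert_svf:  [n_states]     empirical state-visitation frequencies from demonstrations
    policy_svf:  [n_states]     visitation frequencies of the soft-optimal policy for theta
    """
    grad = features.T @ (expert_svf - policy_svf)
    return theta + lr * grad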
Showing results 1 — 15 out of 189,918 results