Deep Q-learning From Demonstrations
2018
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference
We present an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages small sets of demonstration data to massively accelerate the learning process even from relatively small amounts of demonstration ...
Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-making problems. ...
We propose a new deep reinforcement learning algorithm, Deep Q-learning from Demonstrations (DQfD), which leverages even very small amounts of demonstration data to massively accelerate learning. ...
doi:10.1609/aaai.v32i1.11757
fatcat:2czgsd626ne4bhsouack3dn6fm
Deep Q-learning from Demonstrations
[article]
2017
arXiv
pre-print
We present an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages small sets of demonstration data to massively accelerate the learning process even from relatively small amounts of demonstration ...
Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-making problems. ...
We propose a new deep reinforcement learning algorithm, Deep Q-learning from Demonstrations (DQfD), which leverages even very small amounts of demonstration data to massively accelerate learning. ...
arXiv:1704.03732v4
fatcat:aojkn6wbozc6xlcfdfylqsfr6y
Deep reward shaping from demonstrations
2017
2017 International Joint Conference on Neural Networks (IJCNN)
METHOD This section presents the proposed method for deep reward shaping from demonstrations. The method uses a deep supervised network to learn a shaping function from demonstration. ...
With the rising interest in deep reinforcement learning, some efforts explore incorporating learning from demonstrations with RL in a deep learning context. ...
doi:10.1109/ijcnn.2017.7965896
dblp:conf/ijcnn/HusseinEGJ17
fatcat:65fljt6wfbbzjjj5smmm2vzj5y
Pre-training Neural Networks with Human Demonstrations for Deep Reinforcement Learning
[article]
2019
arXiv
pre-print
Deep reinforcement learning (deep RL) has achieved superior performance in complex sequential tasks by using a deep neural network as its function approximator and by learning directly from raw images. ...
A drawback of using raw images is that deep RL must learn the state feature representation from the raw images in addition to learning a policy. ...
Deep Q-network The first successful deep RL method, deep Q-network (DQN), learns to play 49 Atari games directly from screen pixels by combining Q-learning with a deep convolutional neural network [16 ...
arXiv:1709.04083v2
fatcat:jm75zoaffncrzejzpp247fno5u
Hierarchical Deep Q-Network from Imperfect Demonstrations in Minecraft
[article]
2020
arXiv
pre-print
We present Hierarchical Deep Q-Network (HDQfD) that took first place in the MineRL competition. HDQfD works on imperfect demonstrations and utilizes the hierarchical structure of expert trajectories. ...
We introduce the procedure of extracting an effective sequence of meta-actions and subgoals from demonstration data. ...
This hierarchical Deep Q-Network from Demonstrations won first place in the MineRL competition and received a score of 61.61. ...
arXiv:1912.08664v4
fatcat:c7t67u2vxzgmjcqxmtwds4ve3i
Active Deep Q-learning with Demonstration
[article]
2018
arXiv
pre-print
Recent research has shown that although Reinforcement Learning (RL) can benefit from expert demonstration, it usually takes considerable efforts to obtain enough demonstration. ...
Under the framework, we propose Active Deep Q-Network, a novel query strategy which adapts to the dynamically-changing distributions during the RL training process by estimating the uncertainty of recent ...
Deep Q-learning from Demonstration Deep Q-learning from Demonstration (DQfD; Hester et al., 2018) is a state-of-the-art method to leverage demonstration data to accelerate the learning process of DQN ...
arXiv:1812.02632v1
fatcat:b6qnwltqtjbtpc7whmq72wvboe
Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates
[article]
2016
arXiv
pre-print
In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network ...
Deep reinforcement learning alleviates this limitation by training general-purpose neural network policies, but applications of direct deep reinforcement learning algorithms have so far been restricted ...
We demonstrate that, contrary to commonly held assumptions, recently developed off-policy deep Q-function based algorithms such as the Deep Deterministic Policy Gradient ...
arXiv:1610.00633v2
fatcat:i3nllxmobvhy5dmnqdzmhtelqa
Deep reinforcement learning for optical systems: A case study of mode-locked lasers
[article]
2020
arXiv
pre-print
We demonstrate that deep reinforcement learning (deep RL) provides a highly effective strategy for the control and self-tuning of optical systems. ...
Deep RL integrates the two leading machine learning architectures of deep neural networks and reinforcement learning to produce robust and stable learning for control. ...
RESULTS We demonstrate the efficacy of deep reinforcement learning control on mode-locked fiber lasers in Fig. 1. We first demonstrate the deep RL strategy for a single-input control (α₁). ...
arXiv:2006.05579v1
fatcat:xqxghch6rre55dkgnf6wfjicjq
Deep imitation learning for 3D navigation tasks
2017
Neural computing & applications (Print)
In this paper, we propose a deep imitation learning method to learn navigation tasks from demonstrations in a 3D environment. ...
This approach is compared to two popular deep reinforcement learning techniques: deep-Q-networks and Asynchronous actor-critic (A3C). ...
The proposed learning from demonstration method is compared to two popular deep reinforcement learning methods: deep-Q-networks (DQN) which has shown human level behavior on learning Atari games from raw ...
doi:10.1007/s00521-017-3241-z
pmid:29576690
pmcid:PMC5857289
fatcat:k4falirfe5dn7jmzet74vmmbpa
Deep Inverse Q-learning with Constraints
[article]
2020
arXiv
pre-print
We evaluate the resulting algorithms called Inverse Action-value Iteration, Inverse Q-learning and Deep Inverse Q-learning on the Objectworld benchmark, showing a speedup of up to several orders of magnitude ...
This is possible through a formulation that exploits a probabilistic behavior assumption for the demonstrations within the structure of Q-learning. ...
High-level Decision Making for Autonomous Driving We apply Deep Constrained Inverse Q-Learning (DCIQL) to learn autonomous lane-changes on highways from demonstrations. ...
arXiv:2008.01712v1
fatcat:jta6rr7a5ramrdosqtvszfa5du
Performance Enhancement of Deep Reinforcement Learning Networks Using Feature Extraction
[chapter]
2018
Lecture Notes in Computer Science
The results show that the extraction of features from the hidden layers of the Deep Q-Network improves the learning process of the agent (4.58 times faster and better) and proves the existence of encoded ...
The combination of Deep Learning and Reinforcement Learning, termed Deep Reinforcement Learning Networks (DRLN), offers the possibility of using a Deep Learning Neural Network to produce an approximate ...
Lastly, by using features extracted from the last hidden layer of a specific Deep Q-Network, we demonstrate that these can be used to predict the best actions to take from each state to reach the goal ...
doi:10.1007/978-3-319-92537-0_25
fatcat:ha4jjvn3ljg2few2rwkb3ldvci
Active deep Q-learning with demonstration
2019
Machine Learning
Under the framework, we propose Active deep Q-Network, a novel query strategy based on a classical RL algorithm called deep Q-network (DQN). ...
In this work, we propose Active Reinforcement Learning with Demonstration, a new framework to streamline RL in terms of demonstration efforts by allowing the RL agent to query for demonstration actively ...
Deep Q-learning from demonstration Deep Q-learning from Demonstration (DQfD; Hester et al. 2018) is a state-of-the-art method to leverage demonstration data to accelerate the learning process of DQN. ...
doi:10.1007/s10994-019-05849-4
fatcat:izql4tk7zfgedf2v3u5ajgx6s4
Deep Movement Primitives: toward Breast Cancer Examination Robot
[article]
2022
arXiv
pre-print
Robot learning from demonstrations (LfD) reduces the programming time and cost. ...
This paper presents a novel approach to manipulation path/trajectory planning called deep Movement Primitives that successfully generates the movements of a manipulator to reach a breast phantom and perform ...
We propose a novel Learning from Demonstration (LfD) approach called deep movement primitives (deep-MP) directly mapping the visual sensory information into the learned trajectory. ...
arXiv:2202.09265v1
fatcat:kft2cf4225gjzbporjy6patntq
Hierarchically Robust Representation Learning
2020
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
With the tremendous success of deep learning in visual tasks, the representations extracted from intermediate layers of learned models, that is, deep features, attract much attention of researchers. ...
In this work, we investigate this phenomenon and demonstrate that deep features can be suboptimal due to the fact that they are learned by minimizing the empirical risk. ...
It demonstrates that the deep features learned with the proposed algorithm are more robust than those from ERM when the distribution of concepts varies. ...
doi:10.1109/cvpr42600.2020.00736
dblp:conf/cvpr/QianHL20
fatcat:fzy7yohty5h2jklikatobxf3g4
Active Task-Inference-Guided Deep Inverse Reinforcement Learning
[article]
2020
arXiv
pre-print
The module then proceeds to learn a reward function over the augmented state space using a novel deep maximum entropy IRL algorithm. ...
Given a Markov decision process (MDP) and a set of demonstrations for a task, IRL learns a reward function that assigns a real-valued reward to each state of the MDP. ...
Although previous deep MaxEnt IRL methods [4], [34]-[36] can learn deep reward networks from demonstrations, they assume the reward function is a function of the current MDP state. ...
arXiv:2001.09227v3
fatcat:qkall4h7d5ej7mbig5itawlitu