22,119 Hits in 6.1 sec

Learning Deep Neural Network Policies with Continuous Memory States [article]

Marvin Zhang, Zoe McCarthy, Chelsea Finn, Sergey Levine, Pieter Abbeel
2015 arXiv   pre-print
Our approach consists of augmenting the state and action space of the system with continuous-valued memory states that the policy can read from and write to.  ...  In this paper, we present a method for learning policies with internal memory for high-dimensional, continuous systems, such as robotic manipulators.  ...  However, when viewed together with the memory states, the policy is endowed with memory, and can be regarded as a recurrent neural network.  ... 
arXiv:1507.01273v2 fatcat:e4rkh3is3vejzdrigojzihvtbe

Learning deep neural network policies with continuous memory states

Marvin Zhang, Zoe McCarthy, Chelsea Finn, Sergey Levine, Pieter Abbeel
2016 2016 IEEE International Conference on Robotics and Automation (ICRA)  
Our approach consists of augmenting the state and action space of the system with continuous-valued memory states that the policy can read from and write to.  ...  In this paper, we present a method for learning policies with internal memory for high-dimensional, continuous systems, such as robotic manipulators.  ...  However, when viewed together with the memory states, the policy is endowed with memory, and can be regarded as a recurrent neural network.  ... 
doi:10.1109/icra.2016.7487174 dblp:conf/icra/ZhangMFLA16 fatcat:7itfqdzoqbgd5nuyszs3tscxri

A Survey of Domain-Specific Architectures for Reinforcement Learning

Marc Rothmann, Mario Porrmann
2022 IEEE Access  
Both tabular and deep reinforcement learning algorithms are included in this survey. The techniques employed in different implementations are highlighted and compared.  ...  However, their training is often time-consuming, with training times ranging from multiple hours to weeks.  ...  In deep reinforcement learning, the policy π and the value function Q^π(s, a) are usually represented by neural networks.  ... 
doi:10.1109/access.2022.3146518 fatcat:ufrhsktrkza2jjjoi6kdm23rgi

Reinforcement Evolutionary Learning Method for self-learning [article]

Kumarjit Pathak, Jitin Kapila
2018 arXiv   pre-print
Quantitative research is the most widely spread application of data science in Marketing or financial domain where applicability of state of the art reinforcement learning for auto-learning is less explored  ...  There are state of the art methodologies to detect the impact of concept drift, however general strategy considered to overcome the issue in performance is to rebuild or re-calibrate the model periodically  ...  Fearnet uses two complementary memory centers, HC which stores more recent information is a probabilistic neural network. mPFC is described as old memory storage which is a dual purpose DNN (deep Neural  ... 
arXiv:1810.03198v1 fatcat:xdrufyoocbe5ji34pdbdvcn52y

Bayesian Deep Reinforcement Learning via Deep Kernel Learning

Junyu Xuan, Jie Lu, Zheng Yan, Guangquan Zhang
2018 International Journal of Computational Intelligence Systems  
However, such classical deep neural network-based models cannot well handle the uncertainty in sequential decision-making and then limit their learning performance.  ...  As a representative model-free RL algorithm, deep Q-network (DQN) has recently achieved great success on RL problems and even exceed the human performance through introducing deep neural networks.  ...  Specially, the deep kernel learning is a Gaussian process (GP) with a deep kernel modelled by a deep neural network, which has both advantages of deep neural network and GP.  ... 
doi:10.2991/ijcis.2018.25905189 fatcat:ewcbyl27u5gippxdezgf2yamni

Learning offline: memory replay in biological and artificial reinforcement learning

Emma L. Roscow, Raymond Chua, Rui Ponte Costa, Matt W. Jones, Nathan Lepora
2021 Trends in Neurosciences  
Replay is important for memory consolidation in biological neural networks and is key to stabilising learning in deep neural networks.  ...  the understanding of biological and artificial learning and memory.  ...  Deep Q network (DQN): deep neural network that performs Q-learning.  ... 
doi:10.1016/j.tins.2021.07.007 pmid:34481635 fatcat:invan5d3xbbfzf4of5vel3jqfu

Controlling system based on neural networks with reinforcement learning for robotic manipulator

Elena Solovyeva, Ali Abdullah
2020 Information and Control Systems  
Methods: Estimate the action signal's policy by building a numerical algorithm using deep neural networks.  ...  Based on the proposed deep learning method, a model of an agent representing the robotic manipulator was built using four layers neural network for the actor with four layers neural network for the critic  ...  gradient with deep neural networks.  ... 
doi:10.31799/1684-8853-2020-5-24-32 fatcat:bzeiicvkjrd2bcp3iwzuyoyxm4

Model-based actor-critic: GAN (model generator) + DRL (actor-critic) => AGI [article]

Aras Dargazany
2021 arXiv   pre-print
temporal-differencing (TD) error and an episodic memory.  ...  learning the reward function by demonstration.  ...  While most RL algorithms either focus on learning a value function like value iteration and TD-learning (value-based methods), or learning a policy directly such as policy gradient methods (policy-based  ... 
arXiv:2004.04574v9 fatcat:jqzrwh2slnfhpdzz74qseymguy

A Deep Recurrent Q Network towards Self-adapting Distributed Microservices architecture [article]

Basel Magableh
2019 arXiv   pre-print
The performance of DRQN is evaluated against deep Q-learning and policy gradient algorithms including: i) deep q-network (DQN), ii) dueling deep Q-network (DDQN), iii) a policy gradient neural network  ...  (PGNN), and iv) deep deterministic policy gradient (DDPG).  ...  The value w(s, c) is calculated using neural network integrated with deep q-learning/policy gradient algorithms.  ... 
arXiv:1901.04011v2 fatcat:ns44bdnlpvgxhcjddwk4xkpahu

Exploration with Multiple Random ε-Buffers in Off-Policy Deep Reinforcement Learning

Kim, Park
2019 Symmetry  
Deep RL with random ε-greedy policies, such as deep Q-networks (DQNs), can demonstrate efficient exploration behavior.  ...  We demonstrate the benefit of off-policy learning from our method through an experimental comparison of DQN and a deep deterministic policy gradient in terms of discrete action, as well as continuous control  ...  We guarantee similar results for deep learning car simulations online, such as real-time learning with a normal neural network.  ... 
doi:10.3390/sym11111352 fatcat:ecu2fm6lvzd3fgz7pod4qiajfe

Non-Markovian Control with Gated End-to-End Memory Policy Networks [article]

Julien Perez, Tomi Silander
2017 arXiv   pre-print
More precisely, we use a model-free value-based algorithm to learn policies for partially observed domains using this memory-enhanced neural network.  ...  We call the resulting model the Gated End-to-End Memory Policy Network.  ...  Several recent papers successfully apply model-free, direct policy search methods to the problem of learning neural network control policies for challenging continuous domains with many degrees of freedom  ... 
arXiv:1705.10993v1 fatcat:6vzzslvu4jbkjfezth6tdv6nnm

A Survey on Visual Navigation for Artificial Agents with Deep Reinforcement Learning

Fanyu Zeng, Chen Wang, Shuzhi Sam Ge
2020 IEEE Access  
In this paper, we first present an overview on reinforcement learning (RL), deep learning (DL) and deep reinforcement learning (DRL).  ...  Visual navigation for artificial agents with deep reinforcement learning (DRL) is a new research hotspot in artificial intelligence and robotics that incorporates the decision making of DRL into visual  ...  With the rapid development of deep learning [22], [23], DeepMind combines deep learning with reinforcement learning, and proposes deep reinforcement learning [13].  ... 
doi:10.1109/access.2020.3011438 fatcat:ie6qvu24qbapbjxtiudh7fumgy

Deep Reinforcement Learning: An Overview [article]

Yuxi Li
2018 arXiv   pre-print
We start with background of machine learning, deep learning and reinforcement learning.  ...  Next we discuss core RL elements, including value function, in particular, Deep Q-Network (DQN), policy, reward, model, planning, and exploration.  ...  a neural network without read-write memory can not solve.  ... 
arXiv:1701.07274v6 fatcat:x2es3yf3crhqblbbskhxelxf2q

Training a deep policy gradient-based neural network with asynchronous learners on a simulated robotic problem

Winfried Lötzsch, Julien Vitay, Fred Hamker
2017 Jahrestagung der Gesellschaft für Informatik  
Policy gradient methods, such as stochastic policy gradient or deep deterministic policy gradient, propose to overcome this problem by allowing continuous action spaces.  ...  Recent advances in deep reinforcement learning methods have attracted a lot of attention, because of their ability to use raw signals such as video streams as inputs, instead of pre-processed state variables  ...  End-to-end deep RL approaches called policy gradient methods have recently received a lot of attention since they allow to directly learn continuous policies from high dimensional state spaces.  ... 
doi:10.18420/in2017_214 dblp:conf/gi/LotzschVH17 fatcat:fsw2nvbdjjerploxdwje6qfndy

Deep Q Learning in Stabilization of Inverted Pendulum

2019 VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE  
Hence the Q-value function which represents the quality value is approximated by Deep Q-learning that uses a neural network for the same.  ...  In a larger environment, inference of new states from already explored states is a difficult task due to its time & space complexity.  ...  Here comes the necessity of approximating these Q with neural networks. This leads to Deep Q Learning techniques and their merits.  ... 
doi:10.35940/ijitee.b6904.129219 fatcat:yj7jipniynggnjtymw226fgfpy