A deep reinforcement learning model based on deterministic policy gradient for collective neural crest cell migration [article]

Yihao Zhang, Zhaojie Chai, Yubing Sun, George Lykotrafitis
2020 arXiv   pre-print
Here, we propose a novel deep reinforcement learning model for collective neural crest cell migration.  ...  We apply the deep deterministic policy gradient algorithm in association with a particle dynamics simulation environment to train agents to determine the migration path.  ...  DRL considers both policy learning and policy improvement, which has shown great potential in cell biomechanics.  ... 
arXiv:2007.03190v1 fatcat:uwa5vmfbjralpji3pmip4l3fna

Federated Reinforcement Learning Acceleration method for Precise Control of Multiple Devices

Hyun-Kyo Lim, Ju-Bong Kim, Ihsan Ullah, Joo-Seong Heo, Youn-Hee Han
2021 IEEE Access  
Based on the proposed federation policy all agents shared their learning experience (e.g., gradient and model parameters) to update the learning level.  ...  In Section IV, the federation policy of the weight-based gradient sharing and transfer learning are described.  ... 
doi:10.1109/access.2021.3083087 fatcat:em44panxureg7p46kpf3h274xe

Model-Based Reinforcement Learning [chapter]

Soumya Ray, Prasad Tadepalli
2014 Encyclopedia of Machine Learning and Data Mining  
Model-based Reinforcement Learning refers to learning optimal behavior indirectly by learning a model of the environment by taking actions and observing the outcomes that include the next state and the  ...  The models predict the outcomes of actions and are used in lieu of or in addition to interaction with the environment to learn optimal policies.  ...  Learning, and Model-free Reinforcement Learning.  ... 
doi:10.1007/978-1-4899-7502-7_561-1 fatcat:4pwzznqsefhq3e2oqs2mavvxp4

Model-Based Reinforcement Learning [chapter]

Soumya Ray, Prasad Tadepalli
2017 Encyclopedia of Machine Learning and Data Mining  
Synonyms: Indirect Reinforcement Learning. Definition: Model-based Reinforcement Learning refers to learning optimal behavior indirectly by learning a model of the environment by taking actions and observing the outcomes  ...  Assuming that the number of states is not exceedingly high, this suggests a straightforward approach for model-based reinforcement learning.  ... 
doi:10.1007/978-1-4899-7687-1_561 fatcat:klqdisxo4bffhelmyckzzsqz64

Model-Ensemble Trust-Region Policy Optimization [article]

Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel
2018 arXiv   pre-print
In this paper, we analyze the behavior of vanilla model-based reinforcement learning methods when deep neural networks are used to learn both the model and the policy, and show that the learned policy  ...  Alternatively, model-based reinforcement learning promises to reduce sample complexity, but tends to require careful tuning and to date have succeeded mainly in restrictive domains where simple models  ...  VANILLA MODEL-BASED DEEP REINFORCEMENT LEARNING In the most successful methods of model-free reinforcement learning, we iteratively collect data, estimate the gradient of the policy, improve the policy  ... 
arXiv:1802.10592v2 fatcat:2p2vevibdraf3ehcpkqjgnunni

Model-Based Reinforcement Learning [chapter]

Johannes Fürnkranz, Philip K. Chan, Susan Craw, Claude Sammut, William Uther, Adwait Ratnaparkhi, Xin Jin, Jiawei Han, Ying Yang, Katharina Morik, Marco Dorigo, Mauro Birattari (+24 others)
2011 Encyclopedia of Machine Learning  
Model-based Reinforcement Learning refers to learning optimal behavior indirectly by learning a model of the environment by taking actions and observing the outcomes that include the next state and the  ...  The models predict the outcomes of actions and are used in lieu of or in addition to interaction with the environment to learn optimal policies.  ...  Learning, and Model-free Reinforcement Learning.  ... 
doi:10.1007/978-0-387-30164-8_556 fatcat:5m7w3apnrzeoxiirsx4dl2vcrq

Potential Field Guided Actor-Critic Reinforcement Learning [article]

Weiya Ren
2020 arXiv   pre-print
This can be seen as a combination of the model-based gradients and the model-free gradients in policy improvement.  ...  In this paper, we consider the problem of actor-critic reinforcement learning.  ...  By using learned model to do imagination rollouts to accelerate the learning or to get better estimates of action-value functions, model-based reinforcement learning methods [6] [7] [8] allow for more  ... 
arXiv:2006.06923v1 fatcat:f4istxy2yrehbiodzi37j2ed7e

A Survey of Deep Reinforcement Learning in Recommender Systems: A Systematic Review and Future Directions [article]

Xiaocong Chen, Lina Yao, Julian McAuley, Guanglin Zhou, Xianzhi Wang
2021 arXiv   pre-print
of the recent trends of deep reinforcement learning in recommender systems.  ...  In light of the emergence of deep reinforcement learning (DRL) in recommender systems research and several fruitful results in recent years, this survey aims to provide a timely and comprehensive overview  ...  [Fig. 2 labels: Model-Based Deep Reinforcement Learning, Model-Free Deep Reinforcement Learning, Value-based Methods, Policy-based Methods, Hybrid Methods, Learn the Model]  ... 
arXiv:2109.03540v2 fatcat:5gwrbfcj3rc7jfkd54eseck5ga

Policy Gradient Methods [chapter]

Jan Peters, J. Andrew Bagnell
2016 Encyclopedia of Machine Learning and Data Mining  
A policy gradient method is a reinforcement learning approach that directly optimizes a parametrized control policy by gradient descent.  ...  In optimal control, model-based gradient methods have been used for optimizing policies since the late 1960s.  ... 
doi:10.1007/978-1-4899-7502-7_646-1 fatcat:kxs2bj7mrref5d2a55xmh7q7uq

Orthogonal Policy Gradient and Autonomous Driving Application [article]

Mincong Luo, Yin Tong, Jiachi Liu
2018 arXiv   pre-print
In this paper we present an approach called orthogonal policy gradient descent (OPGD) that lets the agent learn the policy gradient based on the current state and the action set, by which the agent can  ...  Fortunately, deep reinforcement learning has enabled enormous progress in both subproblems: giving the correct strategy and evaluating all actions based on the state.  ...  CONCLUSION: In this work we developed a simple and effective model for deep reinforcement learning based on orthogonal policy gradient descent (OPGD).  ... 
arXiv:1811.06151v1 fatcat:lyorxhjcond5ng6zxhwb756b7y

Reinforcement Learning of Speech Recognition System Based on Policy Gradient and Hypothesis Selection [article]

Taku Kato, Takahiro Shinozaki
2017 arXiv   pre-print
In this paper, we propose a general reinforcement learning framework for speech recognition systems based on the policy gradient method.  ...  As a particular instance of the framework, we also propose a hypothesis selection-based reinforcement learning method.  ...  The two major formalizations of reinforcement learning are value-based methods including Q-learning approaches [8, 9, 10] , and policy-based methods including policy gradient methods [11, 12] .  ... 
arXiv:1711.03689v1 fatcat:xkusudrakfesxh7kulwvuo243e

Policy gradient methods

Jan Peters
2010 Scholarpedia  
A policy gradient method is a reinforcement learning approach that directly optimizes a parametrized control policy by gradient descent.  ...  In optimal control, model-based gradient methods have been used for optimizing policies since the late 1960s.  ... 
doi:10.4249/scholarpedia.3698 fatcat:vpn346nvkneq3bswbs5xxdvmzu

A Study of Continuous Maximum Entropy Deep Inverse Reinforcement Learning

Xi-liang Chen, Lei Cao, Zhi-xiong Xu, Jun Lai, Chen-xi Li
2019 Mathematical Problems in Engineering  
We propose a continuous maximum entropy deep inverse reinforcement learning algorithm for continuous state space and continuous action space, which realizes the depth cognition of the environment model  ...  Empirical results on classical control environments on OpenAI Gym: MountainCarContinuous-v0 show that our approach is able to learn policies faster and better.  ...  using the sampled policy gradient: Maximum Entropy Deep Inverse reinforcement learning with Hot start.  ... 
doi:10.1155/2019/4834516 fatcat:khbayb5zbrhonhohw2jpwkuvoa

A Survey of Deep Reinforcement Learning in Video Games [article]

Kun Shao, Zhentao Tang, Yuanheng Zhu, Nannan Li, Dongbin Zhao
2019 arXiv   pre-print
In this paper, we survey the progress of DRL methods, including value-based, policy gradient, and model-based algorithms, and compare their main techniques and properties.  ...  Deep reinforcement learning (DRL) has made great achievements since proposed.  ...  In Section III, we focus on recent DRL methods, including value-based, policy gradient, and model-based DRL methods.  ... 
arXiv:1912.10944v2 fatcat:fsuzp2sjrfcgfkyclrsyzflax4

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions [article]

Amit Kumar Mondal
2020 arXiv   pre-print
My analysis pointed out that most of the models focused on tuning policy values rather than tuning other things in a particular state of reasoning.  ...  Reinforcement learning influences the system to take actions within an arbitrary environment, either with or without previous knowledge of the environment model.  ...  Another interesting model, UNREAL, relies on the policy gradient concept and updates the actor-critic model (which is based on the TD method).  ... 
arXiv:2001.06921v2 fatcat:uwqn4jmginf73ouk3zmm45uozy