Model learning actor-critic algorithms: Performance evaluation in a motion control task

Ivo Grondman, Lucian Busoniu, Robert Babuska
2012 2012 IEEE 51st IEEE Conference on Decision and Control (CDC)  
In the literature, model-based actor-critic algorithms have recently been introduced to considerably speed up the learning by constructing online a model through local linear regression (LLR).  ...  Reinforcement learning (RL) control provides a means to deal with uncertainty and nonlinearity associated with control tasks in an optimal way.  ...  Model Learning Actor-Critic In addition to learning the actor and critic functions, the Model Learning Actor-Critic (MLAC) method learns an approximate process model x = f_ζ(x, u).  ... 
doi:10.1109/cdc.2012.6426427 dblp:conf/cdc/GrondmanBB12 fatcat:iihtxvfeg5bdfg6c4bsvksf4jm

Actor-Critic Control with Reference Model Learning

Ivo Grondman, Maarten Vaandrager, Lucian Busoniu, Robert Babuska, Erik Schuitema
2011 IFAC Proceedings Volumes  
We propose a new actor-critic algorithm for reinforcement learning.  ...  The algorithm does not use an explicit actor, but learns a reference model which represents a desired behaviour, along which the process is to be controlled by using the inverse of a learned process model  ...  EXAMPLE: PENDULUM SWINGUP To evaluate and compare the performance of our algorithm, we apply it to the task of learning to swing up a simulated inverted pendulum and compare it to the standard algorithm  ... 
doi:10.3182/20110828-6-it-1002.00759 fatcat:b6wr37lrm5cnznr7fic3b46pcm

On the Role of Models in Learning Control: Actor-Critic Iterative Learning Control [article]

Maurice Poot, Jim Portegies, Tom Oomen
2020 arXiv   pre-print
These basis functions encode implicit model knowledge and the actor-critic algorithm learns the feedforward parameters without explicitly using a model.  ...  The developed actor-critic iterative learning control (ACILC) framework uses a feedforward parameterization with basis functions.  ...  Actor-critic learning The actor-critic algorithm for the solution of the optimal control problem is presented here.  ... 
arXiv:2007.00430v2 fatcat:4fwlzluvkjh4xbwse5oz4k3jta

Model-based Actor-critic Learning of Robotic Impedance Control in Complex Interactive Environment

Xingwei Zhao, Shibo Han, Bo Tao, Zhou-Ping Yin, Han Ding
2021 IEEE transactions on industrial electronics (1982. Print)  
To learn the interactive skill, a model-based actor-critic learning algorithm and a safety-learning strategy are proposed in this paper to find the optimal impedance control, in which the learning process  ...  The effectiveness of the learning algorithm and the performance of the learned impedance control are validated in a UR5 robot.  ...  To realize RL of robotic impedance control, a model-based actor-critic (AC) learning algorithm and a safety-learning strategy are proposed in this paper.  ... 
doi:10.1109/tie.2021.3134082 fatcat:twbmcfnvi5aw5kamnatsc7hwfy

Reinforcement Learning Algorithms in Humanoid Robotics [chapter]

Dusko Katic, Miomir Vukobratovic
2007 Humanoid Robots: New Developments  
Finally, one ought to point out that the problem of motion of humanoid robots is a very complex control task, especially when the real environment is taken into account, requiring as a minimum, its integration  ...  All the mentioned characteristics have to be taken into account in the synthesis of advanced control algorithms that accomplish stable, fast and reliable performance of humanoid robots.  ...  Acknowledgments The work described in this chapter was conducted within the national research project "Dynamic and Control of High-Performance Humanoid Robots: Theory and Application" and was funded  ... 
doi:10.5772/4878 fatcat:brp43tnj35c2lmalaawtv5gv6u

UAV maneuvering decision-making algorithm based on Twin Delayed Deep Deterministic Policy Gradient Algorithm

Shuangxia Bai, Shaomei Song, Shiyang Liang, Jianmei Wang, Bo Li, Evgeny Neretin
2021 Journal of Artificial Intelligence and Technology  
The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm in deep reinforcement learning are used to train the model, and the experimental  ...  Aiming at intelligent decision-making of UAV based on situation information in air combat, a novel maneuvering decision method based on deep reinforcement learning is proposed in this paper.  ...  , which makes TD3 algorithm perform better than DDPG in many continuous control tasks.  ... 
doi:10.37965/jait.2021.12003 fatcat:nq3r65srfze43pnfias4oiri4u

Motion Control of Unmanned Underwater Vehicles Via Deep Imitation Reinforcement Learning Algorithm

Zhenzhong Chu, Bo Sun, Daqi Zhu, Mingjun Zhang, Chaomin Luo
2020 IET Intelligent Transport Systems  
In this study, a motion control algorithm based on deep imitation reinforcement learning is proposed for the unmanned underwater vehicles (UUVs).  ...  The deep reinforcement learning employs actor-critic architecture. The actor part executes the control strategy and the critic part evaluates current control strategy.  ...  The motion control of UUV is a continuous control task, so the TD3 based on actor-critic architecture is adopted.  ... 
doi:10.1049/iet-its.2019.0273 fatcat:3fjyronvanh6zlhkhur3gjg6wq

Soft Actor-Critic Algorithms and Applications [article]

Tuomas Haarnoja and Aurick Zhou and Kristian Hartikainen and George Tucker and Sehoon Ha and Jie Tan and Vikash Kumar and Henry Zhu and Abhishek Gupta and Pieter Abbeel and Sergey Levine
2019 arXiv   pre-print
Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks.  ...  In this paper, we describe Soft Actor-Critic (SAC), our recently introduced off-policy actor-critic algorithm based on the maximum entropy RL framework.  ...  Acknowledgments We would like to thank Vitchyr Pong and Haoran Tang for constructive discussions during the development of soft actor-critic, Vincent Vanhoucke for his support towards the project at Google  ... 
arXiv:1812.05905v2 fatcat:dw4325vci5ezzpfyfrtulkpko4

Smart Train Operation Algorithms based on Expert Knowledge and Reinforcement Learning [article]

Kaichen Zhou, Shiji Song, Anke Xue, Keyou You, Hui Wu
2021 arXiv   pre-print
Compared with previous works, the proposed algorithms can realize the control of continuous action for the subway system and optimize multiple critical objectives without using an offline speed profile  ...  operation algorithm are better than expert manual driving and existing ATO algorithms in terms of energy efficiency.  ...  These concerns make energy efficiency play a core actor in our control model designing.  ... 
arXiv:2003.03327v3 fatcat:qa6pddzifjghzmwru3yv5cr2im

A Survey of Deep Reinforcement Learning Algorithms for Motion Planning and Control of Autonomous Vehicles [article]

Fei Ye, Shen Zhang, Pin Wang, Ching-Yao Chan
2021 arXiv   pre-print
In this survey, we systematically summarize the current literature on studies that apply reinforcement learning (RL) to the motion planning and control of autonomous vehicles.  ...  However, this approach does not automatically guarantee maximal performance due to the lack of a system-level optimization.  ...  Actor-Critic algorithm has been successfully applied in both discrete behavior decision makings and continuous motion planning tasks [15], [16].  ... 
arXiv:2105.14218v2 fatcat:27glt4i4lfhg3j4ozjrlsq6i3e

Machine Learning Algorithms in Bipedal Robot Control

Shouyi Wang, Wanpracha Chaovalitwongse, Robert Babuska
2012 IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews)  
This paper gives a review of recent advances on the state-of-the-art learning algorithms and their applications to bipedal robot control.  ...  In the past decades, machine learning techniques, such as supervised learning, reinforcement learning, and unsupervised learning, have been increasingly used in the control engineering community.  ...  With a classical control approach, a robot is explicitly programmed to perform the desired task using a complete mathematical model of the robot and its environment.  ... 
doi:10.1109/tsmcc.2012.2186565 fatcat:tchoesxg6rc2vkuh7gtlxxfosa

Research on the Multiagent Joint Proximal Policy Optimization Algorithm Controlling Cooperative Fixed-Wing UAV Obstacle Avoidance

Weiwei Zhao, Hairong Chu, Xikui Miao, Lihong Guo, Honghai Shen, Chenhao Zhu, Feng Zhang, Dongxin Liang
2020 Sensors  
To evaluate the performance of the algorithm, we use the MAJPPO algorithm to complete the task of multi-UAV formation and the crossing of multiple-obstacle environments.  ...  Aiming at the problem of a non-stationary environment caused by the change of learning agent strategy in reinforcement learning in a multi-agent environment, the paper presents an improved multiagent reinforcement  ...  In order to learn a stable control model, a reasonable reward function structure is necessary.  ... 
doi:10.3390/s20164546 pmid:32823783 fatcat:eqielsplina5xeyenrosylf3cu

Real Time Robot Policy Adaptation Based on Intelligent Algorithms [chapter]

Genci Capi, Hideki Toda, Shin-Ichiro Kaneko
2011 IFIP Advances in Information and Communication Technology  
The proposed algorithm is evaluated in the simulated environment of the Cyber Rodent (CR) robot, where the robot has to increase its energy level by capturing the active battery packs.  ...  In this paper we present a new method for robot real time policy adaptation by combining learning and evolution. The robot adapts the policy as the environment conditions change.  ...  In order to test the effectiveness of the proposed algorithm, we considered a biologically inspired task for the CR robot ([13]).  ... 
doi:10.1007/978-3-642-23960-1_1 fatcat:4uiz3gxkvnaqrcfh6l5q6yokwe

Reinforcement Learning Under Algorithmic Triage [article]

Eleni Straitouri, Adish Singla, Vahid Balazadeh Meresht, Manuel Gomez-Rodriguez
2021 arXiv   pre-print
To this end, we look at the problem through the framework of options and develop a two-stage actor-critic method to learn reinforcement learning models under triage.  ...  In this work, we take a first step towards developing reinforcement learning models that are optimized to operate under algorithmic triage.  ...  In learning under algorithmic triage, one has to find not only a machine learning model but also a triage policy which determines who decides, the model or the human, and when.  ... 
arXiv:2109.11328v1 fatcat:suftzvuqtba45pk5yyxmm6zloi

Adaptive Inverse Optimal Control for Rehabilitation Robot Systems Using Actor-Critic Algorithm

Fancheng Meng, Yaping Dai
2014 Mathematical Problems in Engineering  
To this goal, a new adaptive inverse optimal hybrid control (AHC) combining inverse optimal control and actor-critic learning is proposed.  ...  Then, based on this model, an open-loop error system is formed; thereafter, an inverse optimal control input is designed to minimize the cost functional and a NN-based actor-critic feedforward signal is  ...  Then, according to the evaluation result, we make a suitable training task, as shown in Figure 8 .  ... 
doi:10.1155/2014/285248 fatcat:rrj4mynikzdebduulpylh5ygrq