
Accelerating Quadratic Optimization with Reinforcement Learning [article]

Jeffrey Ichnowski, Paras Jain, Bartolomeo Stellato, Goran Banjac, Michael Luo, Francesco Borrelli, Joseph E. Gonzalez, Ion Stoica, Ken Goldberg
2021 arXiv   pre-print
To address these, we explore how Reinforcement Learning (RL) can learn a policy to tune parameters to accelerate convergence.  ...  First-order methods for quadratic optimization such as OSQP are widely used for large-scale machine learning and embedded optimal control, where many related problems must be rapidly solved.  ...  Acknowledgements This research was performed at the AUTOLAB at UC Berkeley in affiliation with the Berkeley AI Research (BAIR) Lab, and the CITRIS "People and Robots" (CPAR) Initiative.  ... 
arXiv:2107.10847v1 fatcat:4tig7rv74jb7nps4gtawa5z4j4
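The entry above is about learning a policy that tunes first-order solver parameters (for OSQP, chiefly the ADMM step size rho) to cut iteration counts. As a minimal sketch of where such a tuned parameter enters, assuming the standard osqp Python package and a toy QP; the problem data and the fixed rho value are illustrative stand-ins, not taken from the paper:

```python
import numpy as np
import scipy.sparse as sparse
import osqp

# Toy QP: minimize 0.5 x'Px + q'x  subject to  l <= Ax <= u.
P = sparse.csc_matrix([[4.0, 1.0], [1.0, 2.0]])
q = np.array([1.0, 1.0])
A = sparse.csc_matrix([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
l = np.array([1.0, 0.0, 0.0])
u = np.array([1.0, 0.7, 0.7])

# In the paper's setting a learned policy would map problem/solver features
# to a step size; here a fixed value stands in for that policy's output.
rho_from_policy = 0.5  # hypothetical policy output

solver = osqp.OSQP()
solver.setup(P, q, A, l, u, rho=rho_from_policy, adaptive_rho=False, verbose=False)
result = solver.solve()
print(result.info.status, result.info.iter)  # iteration count varies with rho
```

In the learned setting, the policy would be queried per problem (or per block of iterations) instead of fixing rho once, which is what "tune parameters to accelerate convergence" refers to in the snippet.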

Automated Driving Maneuvers under Interactive Environment based on Deep Reinforcement Learning [article]

Pin Wang, Ching-Yao Chan, Hanhan Li
2019 arXiv   pre-print
One of the state-of-the-art approaches is to apply Reinforcement Learning (RL) to learn a time-sequential driving policy, to execute proper control strategy or tracking trajectory in dynamic situations  ...  The learning model is presented in a closed form of continuous control variables and trained in a simulation platform that we have developed with embedded properties of real-time vehicle interactions.  ...  Reinforcement learning algorithms, with the capability of dealing with time-sequential problems, can seek optimal policies by learning from trial and error.  ... 
arXiv:1803.09200v3 fatcat:jas3fa7rkfeatpqwcnnlfjtm4e

Quadratic Q-network for Learning Continuous Control for Autonomous Vehicles [article]

Pin Wang, Hanhan Li, Ching-Yao Chan
2019 arXiv   pre-print
Particularly, instead of using a big neural network as Q-function approximation, we design a Quadratic Q-function over actions with multiple simple neural networks for finding optimal values within a continuous  ...  Direct applications of Reinforcement Learning algorithms with discrete action space will yield unsatisfactory results at the operational level of driving where continuous control actions are actually required  ...  With the quadratic format, the optimal action can be obtained easily and analytically.  ... 
arXiv:1912.00074v1 fatcat:2udhnvqtljc4zgegvbtg2sb2bq
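The key idea in this entry is that a Q-function that is quadratic in the action has an analytic maximizer. A minimal numpy sketch of that structure; the state-dependent terms mu, P, and V stand in for the paper's simple networks, and their values here are purely illustrative:

```python
import numpy as np

def quadratic_q(mu, P, V, action):
    """Q(s, a) = V(s) - (a - mu(s))' P(s) (a - mu(s)), with P(s) positive definite."""
    d = action - mu
    return V - d @ P @ d

# Illustrative outputs of the simple per-state networks:
mu = np.array([0.3, -0.1])            # e.g. [acceleration, steering] head
P = np.array([[2.0, 0.0],
              [0.0, 1.0]])            # positive-definite curvature term
V = 1.5                               # state-value head

# Because Q is concave quadratic in a, argmax_a Q(s, a) = mu(s) analytically,
# so no numerical search over the continuous action space is needed.
best_action = mu
print(quadratic_q(mu, P, V, best_action))  # equals V, the maximum at this state
```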

A Quadratic Programming Formulation of a Moving Ball Interception and Shooting Behaviour, and its Application to Neural Network Control [chapter]

Frederic Maire, Doug Taylor
2001 Lecture Notes in Computer Science  
We present experimental results showing the validity of the approach and discuss potential applications of this approach in the context of reinforcement learning.  ...  -line by a quadratic programming problem optimiser.  ...  Maire, F.: Bicephal Reinforcement Learning. QUT FIT Technical Report, FIT-TR-00-01. 8.  ... 
doi:10.1007/3-540-45324-5_34 fatcat:usqr2bfzhvfpphxsgdshpdfwlu

Autonomous Ramp Merge Maneuver Based on Reinforcement Learning with Continuous Action Space [article]

Pin Wang, Ching-Yao Chan
2018 arXiv   pre-print
In place of the conventional rule-based approaches, we propose to apply reinforcement learning algorithm on the automated vehicle agent to find an optimal driving policy by maximizing the long-term reward  ...  Most importantly, in contrast to most reinforcement learning applications in which the action space is resolved as discrete, our approach treats the action space as well as the state space as continuous  ...  A reinforcement learning agent learns from past experience and tries to capture the best possible knowledge to find an optimal action given its current state, with the goal of maximizing a long-term reward  ... 
arXiv:1803.09203v1 fatcat:ongn4b3ar5crjlp46xzcfpf7v4

Longitudinal Dynamic versus Kinematic Models for Car-Following Control Using Deep Reinforcement Learning

Yuan Lin, John McPhee, Nasser L. Azad
2019 2019 IEEE Intelligent Transportation Systems Conference (ITSC)  
The majority of current studies on autonomous vehicle control via deep reinforcement learning (DRL) utilize point-mass kinematic models, neglecting vehicle dynamics, which include acceleration delay and  ...  The training results show that the redesigned DRL controller results in near-optimal control performance of car following with vehicle dynamics considered when compared with dynamic programming solutions  ...  The reinforcement learning problem is solved using Bellman's principle of optimality.  ... 
doi:10.1109/itsc.2019.8916781 dblp:conf/itsc/LinMA19 fatcat:rnjsbewftnfqxgb253tfa3cfcq
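This entry benchmarks DRL controllers against dynamic programming solutions derived from Bellman's principle of optimality. As a reminder of what that baseline computes, here is a generic value-iteration sketch on a small finite MDP; the transition and reward arrays are random placeholders, not the car-following model from the paper:

```python
import numpy as np

n_states, n_actions, gamma = 5, 3, 0.95
rng = np.random.default_rng(0)
# Placeholder MDP: P[s, a, s'] transition probabilities, R[s, a] rewards.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.normal(size=(n_states, n_actions))

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup: V(s) = max_a [R(s,a) + gamma * sum_s' P(s'|s,a) V(s')]
    Q = R + gamma * (P @ V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy policy recovered from the converged values
print(V, policy)
```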

Continuous Deep Q-Learning with Model-based Acceleration [article]

Shixiang Gu and Timothy Lillicrap and Ilya Sutskever and Sergey Levine
2016 arXiv   pre-print
To further improve the efficiency of our approach, we explore the use of learned models for accelerating model-free reinforcement learning.  ...  NAF representation allows us to apply Q-learning with experience replay to continuous tasks, and substantially improves performance on a set of simulated robotic control tasks.  ...  Model-free reinforcement learning in domains with continuous  ... 
arXiv:1603.00748v1 fatcat:jdrmstwwm5au7lrper5r5dicum
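The NAF representation cited above also keeps Q quadratic in the action, typically by predicting a lower-triangular matrix L(s) and forming P(s) = L(s)L(s)^T so the advantage term is concave in the action and the greedy action is available in closed form. A small numpy sketch of that construction; shapes and values are illustrative, not the paper's networks:

```python
import numpy as np

def naf_q(V, mu, L_entries, action):
    """Q(s, a) = V(s) - 0.5 (a - mu(s))' P(s) (a - mu(s)), with P(s) = L(s) L(s)'."""
    dim = mu.shape[0]
    L = np.zeros((dim, dim))
    L[np.tril_indices(dim)] = L_entries   # network outputs fill the lower triangle
    P = L @ L.T                           # positive semi-definite by construction
    d = action - mu
    return V - 0.5 * d @ P @ d

V, mu = 0.8, np.array([0.2, -0.4])        # state-value head and greedy-action head
L_entries = np.array([1.0, 0.3, 0.7])     # 3 numbers parameterize a 2x2 lower triangle
print(naf_q(V, mu, L_entries, mu))        # maximized at a = mu(s), giving Q = V
```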

Tracking Control of Intelligent Vehicle Lane Change Based on RLMPC

Quanshan Hou, Yanan Zhang, Shuai Zhao, Yunhao Hu, Yongwang Shen, L. Zhang, S. Defilla, W. Chu
2021 E3S Web of Conferences  
In this paper, drawing on the strong self-learning ability of reinforcement learning, an interactive model predictive control algorithm is designed to realize tracking control of the lane change  ...  Traditional controllers have disadvantages such as weak scene adaptability and difficulty in balancing multi-objective optimization.  ...  Reinforcement learning can learn interactively from the external environment, which gives the reinforcement-learning-based MPC prediction model more accurate predictions and  ... 
doi:10.1051/e3sconf/202123304019 fatcat:d6lejkse35cazp6hvwsummpwoi

Reinforcement learning by reward-weighted regression for operational space control

Jan Peters, Stefan Schaal
2007 Proceedings of the 24th international conference on Machine learning - ICML '07  
Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with  ...  However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains  ...  With τ being adaptive, significantly faster learning of the optimal policy is achieved.  ... 
doi:10.1145/1273496.1273590 dblp:conf/icml/PetersS07 fatcat:tsfu5dzeinedbahnimd7lyjifa
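The reward-weighted regression step summarized in this entry turns immediate-reward policy improvement into a weighted least-squares fit, with weights that increase with reward and a temperature τ that the paper adapts over iterations. A hedged numpy sketch of one such update for a linear-in-features policy mean; the feature map, exploration noise, toy reward, and fixed τ are assumptions made only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def features(state):
    return np.array([1.0, state, state ** 2])   # hypothetical feature map phi(s)

theta = np.zeros(3)   # policy mean: u(s) = theta . phi(s)
tau = 1.0             # temperature; adapted across iterations in the paper

for _ in range(50):
    states = rng.uniform(-1.0, 1.0, size=64)
    Phi = np.stack([features(s) for s in states])
    actions = Phi @ theta + 0.1 * rng.normal(size=64)      # exploratory rollouts
    rewards = -np.abs(actions - np.sin(np.pi * states))     # toy immediate reward
    w = np.exp(rewards / tau)                                # reward-based weights
    # Reward-weighted regression: refit the policy mean by weighted least squares.
    W = np.diag(w)
    theta = np.linalg.solve(Phi.T @ W @ Phi + 1e-6 * np.eye(3), Phi.T @ W @ actions)

print(theta)  # drifts toward parameters whose actions earned high reward
```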

A Reinforcement Learning Based Approach for Automated Lane Change Maneuvers

Pin Wang, Ching-Yao Chan, Arnaud de La Fortelle
2018 2018 IEEE Intelligent Vehicles Symposium (IV)  
In our study, we propose a Reinforcement Learning based approach to train the vehicle agent to learn an automated lane change behavior such that it can intelligently make a lane change under diverse and  ...  Extensive simulations are conducted for training the algorithms, and the results illustrate that the Reinforcement Learning based vehicle agent is capable of learning a smooth and efficient driving policy  ...  Reinforcement learning, one promising category in the machine learning family, has the capability of dealing with time-sequential problems and seeking optimal policies for long-term objectives by learning  ... 
doi:10.1109/ivs.2018.8500556 dblp:conf/ivs/WangCF18 fatcat:q6ldxtygnvhflcrcg5zqlohw4e

Using Reward-weighted Regression for Reinforcement Learning of Task Space Control

Jan Peters, Stefan Schaal
2007 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning  
Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with  ...  However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains  ...  With τ being adaptive, significantly faster learning of the optimal policy is achieved.  ... 
doi:10.1109/adprl.2007.368197 fatcat:4d4rryxokrf3tftqg2bhoqjtdq

Online Reinforcement Learning-Based Control of an Active Suspension System Using the Actor Critic Approach

Ahmad Fares, Ahmad Bani Younes
2020 Applied Sciences  
In this paper, a controller learns to adaptively control an active suspension system using reinforcement learning without prior knowledge of the environment.  ...  The performance of the controller is compared with the Linear Quadratic Regulator (LQR) and optimum Proportional-Integral-Derivative (PID), and the adaptiveness is tested by estimating some of the system's  ...  The learning algorithm implemented by [13] was able to achieve near optimal results compared with the Linear Quadratic Gaussian (LQG) under idealized conditions.  ... 
doi:10.3390/app10228060 fatcat:t3a2sa527ffonophc2u3y5f5ye
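The controller above is compared with an LQR baseline, which for a linear plant x_dot = Ax + Bu and a quadratic cost solves a Riccati equation for the feedback gain K. A minimal scipy sketch of that baseline; the second-order plant matrices are placeholders, not the paper's suspension model:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Placeholder linear plant x_dot = A x + B u (a real quarter-car model has more states).
A = np.array([[0.0, 1.0],
              [-10.0, -2.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.diag([100.0, 1.0])   # state penalty
R = np.array([[0.1]])       # actuation penalty

# LQR: solve A'P + PA - P B R^{-1} B' P + Q = 0, then K = R^{-1} B' P.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)
print(K)  # u = -K x is the optimal state feedback for this quadratic cost
```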

Driving Decision and Control for Autonomous Lane Change based on Deep Reinforcement Learning [article]

Tianyu Shi, Pin Wang, Xuxin Cheng, Ching-Yao Chan, Ding Huang
2019 arXiv   pre-print
Furthermore, we design two similar Deep Q learning frameworks with quadratic approximator for deciding how to select a comfortable gap and just follow the preceding vehicle.  ...  We apply Deep Q-network (DQN) with the consideration of safety during the task for deciding whether to conduct the maneuver.  ...  We defined the continuous action space of the 'preparation layer' with longitudinal acceleration as in Eq. (3). For these functions, we leverage Deep Reinforcement Learning in each module, within a hierarchical  ... 
arXiv:1904.10171v2 fatcat:ey22jtjj2vajrlf355rj2b7gee

The Design of Performance Guaranteed Autonomous Vehicle Control for Optimal Motion in Unsignalized Intersections

Balázs Németh, Péter Gáspár
2021 Applied Sciences  
First, an environment model for the intersection was created based on a constrained quadratic optimization, with which guarantees on collision avoidance can be provided.  ...  Second, the environment model was used in the training process, which was based on a reinforcement learning method.  ...  ., with the extension of the reward function in the reinforcement learning process.  ... 
doi:10.3390/app11083464 fatcat:dfb7x76qcnbojilully3wgvq3a
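The entry notes that the guarantees from the constrained quadratic environment model are carried into training by extending the reward function. A hypothetical sketch of such a reward extension; the gap signal, limit, and penalty weight are assumptions, not the paper's formulation:

```python
def shaped_reward(progress, min_gap, gap_limit=2.0, penalty=50.0):
    """Hypothetical reward extension: keep the nominal progress term and subtract a
    penalty whenever the environment model predicts the gap to surrounding vehicles
    dropping below a safety limit (all values illustrative)."""
    violation = max(0.0, gap_limit - min_gap)
    return progress - penalty * violation

print(shaped_reward(progress=1.0, min_gap=1.2))  # 1.0 - 50 * 0.8 = -39.0
```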

Reinforcement Learning for Operational Space Control

Jan Peters, Stefan Schaal
2007 Engineering of Complex Computer Systems (ICECCS), Proceedings of the IEEE International Conference on  
Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with  ...  However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains  ...  With τ being adaptive, significantly faster learning of the optimal policy is achieved.  ... 
doi:10.1109/robot.2007.363633 dblp:conf/icra/PetersS07 fatcat:hkkohfhqonc4vbcx3q6nt3l5j4
Showing results 1 — 15 out of 8,444 results