Blending MPC Value Function Approximation for Efficient Reinforcement Learning [article]

Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots
2021 arXiv   pre-print
We present a framework for improving on MPC with model-free reinforcement learning (RL). The key insight is to view MPC as constructing a series of local Q-function approximations.  ...  We show that by using a parameter λ, similar to the trace decay parameter in TD(λ), we can systematically trade-off learned value estimates against the local Q-function approximations.  ...  MITIGATING BIAS IN MPC VIA REINFORCEMENT LEARNING In this section, we develop our approach to systematically deal with model bias in MPC by blending-in learned value estimates.  ... 
arXiv:2012.05909v2 fatcat:jbd52k2g65ahdotaohtmom3rrm

Combining Reinforcement Learning with Model Predictive Control for On-Ramp Merging [article]

Joseph Lubars, Harsh Gupta, Sandeep Chinchali, Liyun Li, Adnan Raja, R. Srikant, Xinzhou Wu
2021 arXiv   pre-print
Two broad classes of techniques have been proposed to solve motion planning problems in autonomous driving: Model Predictive Control (MPC) and Reinforcement Learning (RL).  ...  We subsequently present an algorithm which blends the model-free RL agent with the MPC solution and show that it provides better trade-offs between all metrics -- passenger comfort, efficiency, crash rate  ...  ) and Reinforcement Learning (RL), perform for this problem.  ... 
arXiv:2011.08484v3 fatcat:rq573i743rg7blmsykdzfnibky

Towards a Smarter Energy Management System for Hybrid Vehicles: A Comprehensive Review of Control Strategies

Xu, Kong, Chu, Ju, Yang, Xu, Xu
2019 Applied Sciences  
Based on neural network and the large data processing technology, data-driven strategies are put forward due to their approximate optimality and high computational efficiency.  ...  This paper not only provides a comprehensive analysis of energy management control strategies for HEVs, but also presents the emphasis in the future.  ...  Reinforcement learning, as a new research hotspot, uses neural networks to approximate the Q function.  ... 
doi:10.3390/app9102026 fatcat:tbkqyk26grh55e3yjxl4jag3ny

Tailored neural networks for learning optimal value functions in MPC [article]

Dieter Teichrib, Moritz Schulze Darup
2021 arXiv   pre-print
However, efficiently learning the optimal control policy, the optimal value function, or the Q-function requires suitable function approximators.  ...  In this paper, we provide a similar result for representing the optimal value function and the Q-function that are both known to be piecewise quadratic for linear MPC.  ...  Blending MPC and value function approximation for efficient reinforcement learning.  ... 
arXiv:2112.03975v1 fatcat:wexhgbh2kvcjvitbqfwiadvbnq

Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning [article]

Lukas Brunke, Melissa Greeff, Adam W. Hall, Zhaocong Yuan, Siqi Zhou, Jacopo Panerati, Angela P. Schoellig
2021 arXiv   pre-print
The last half-decade has seen a steep rise in the number of contributions on safe learning methods for real-world robotic deployments from both the control and reinforcement learning communities.  ...  control and reinforcement learning approaches.  ...  Learning Backward Value Functions.  ... 
arXiv:2108.06266v2 fatcat:gbbe3qyatfgelgzhqzglecr5qm

A Survey on Learning-Based Model Predictive Control: Toward Path Tracking Control of Mobile Platforms

Kanghua Zhang, Jixin Wang, Xueting Xin, Xiang Li, Chuanwen Sun, Jianfei Huang, Weikang Kong
2022 Applied Sciences  
Furthermore, some research challenges faced by LB-MPC for path tracking control in mobile platforms are discussed.  ...  The model predictive control (MPC) provides an integrated solution for control systems with interactive variables, complex dynamics, and various constraints.  ...  Acknowledgments: The authors would like to thank all anonymous reviewers and editors for their helpful suggestions for the improvement of this paper.  ... 
doi:10.3390/app12041995 fatcat:hdy753pdbfhjvihdopaw6lip2u

Hierarchical Evasive Path Planning using Reinforcement Learning and Model Predictive Control

Arpad Feher, Szilard Aradi, Tamas Becsi
2020 IEEE Access  
This paper presents a solution based on Reinforcement Learning for a point-free avoidance path generation for the emergency double-lane change situation.  ...  The agent achieves greater efficiency in higher speed cases with this solution. µ_max = 0.0037 e^(0.0693 v_0). The reward function (10) subtracts the maximum values that occur during the route's execution  ... 
doi:10.1109/access.2020.3031037 fatcat:w6bemcaf3jdyjni6lio5koeqiu

Analyzing the Improvements of Energy Management Systems for Hybrid Electric Vehicles Using a Systematic Literature Review: How Far Are These Controls from Rule-Based Controls Used in Commercial Vehicles?

Juan P. Torreglosa, Pablo Garcia-Triviño, David Vera, Diego A. López-García
2020 Applied Sciences  
This work presents a systematic literature review (SLR) of the more recent works that developed EMSs for HEVs.  ...  predictive control, RL-reinforcement learning).  ...  Some interesting sub-categories of learning-based strategies are reinforcement learning (RL) and neural network learning (NNL).  ... 
doi:10.3390/app10238744 fatcat:jacrh6igaffm3etle65xbs6pnm

Evaluating model-based planning and planner amortization for continuous control [article]

Arunkumar Byravan, Leonard Hasenclever, Piotr Trochim, Mehdi Mirza, Alessandro Davide Ialongo, Yuval Tassa, Jost Tobias Springenberg, Abbas Abdolmaleki, Nicolas Heess, Josh Merel, Martin Riedmiller
2021 arXiv   pre-print
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning; the learned policy serves as a proposal for MPC.  ...  We find that well-tuned model-free agents are strong baselines even for high DoF control problems but MPC with learned proposals and models (trained on the fly or transferred from related tasks) can significantly  ...  Other recently proposed algorithmic innovations blend MPC with learned value estimates to trade off model and value errors (Bhardwaj et al., 2021).  ... 
arXiv:2110.03363v1 fatcat:do6vywo47jdf3jzcjabw5e6ngm

Thorough state-of-the-art analysis of electric and hybrid vehicle powertrains: Topologies and integrated energy management strategies

Dai-Duong Tran, Majid Vafaeipour, Mohamed El Baghdadi, Ricardo Barrero, Joeri Van Mierlo, Omar Hegazy
2019 Renewable & Sustainable Energy Reviews  
Achieving an energy-efficient powertrain requires tackling several conflicting control objectives such as the drivability, fuel economy, reduced emissions, and battery state of charge preservation, which  ...  Keywords: Full-electric vehicle; Energy management strategy optimisation; Online EMS; Offline EMS; Optimal control strategy. Abstract: Hybrid and electric vehicles have been demonstrated as auspicious solutions for  ...  We also acknowledge Flanders Make and VLAIO for the support of our research group.  ... 
doi:10.1016/j.rser.2019.109596 fatcat:ybks774km5htvb6f7foyetum7m

Reinforcement Learning Optimized Look-Ahead Energy Management of a Parallel Hybrid Electric Vehicle

Teng Liu, Xiaosong Hu, Shengbo Eben Li, Dongpu Cao
2017 IEEE/ASME Transactions on Mechatronics  
This paper presents a predictive energy management strategy for a parallel hybrid electric vehicle (HEV) based on velocity prediction and reinforcement learning (RL).  ...  The design procedure starts with modeling the parallel HEV as a systematic control-oriented model and defining a cost function.  ...  VELOCITY PREDICTION AND REINFORCEMENT LEARNING A.  ... 
doi:10.1109/tmech.2017.2707338 fatcat:yv6bliu6orenjfedacino7fesa

Adaptive Probabilistic Trajectory Optimization via Efficient Approximate Inference [article]

Yunpeng Pan, Xinyan Yan, Evangelos Theodorou, Byron Boots
2016 arXiv   pre-print
While Reinforcement Learning (RL) can be used to compute optimal policies with little prior knowledge about the environment, it suffers from slow convergence.  ...  Our method uses scalable approximate inference to learn and updates probabilistic models in an online incremental fashion while also computing optimal control policies via successive local approximations  ...  ) a second-order local approximation of the value function.  ... 
arXiv:1608.06235v2 fatcat:hj4ghkf5djbwbcbd5ualtgw4v4

A Tour of Reinforcement Learning: The View from Continuous Control [article]

Benjamin Recht
2018 arXiv   pre-print
This manuscript surveys reinforcement learning from the perspective of optimization and control with a focus on continuous control applications.  ...  In particular, theory and experiment demonstrate the role and importance of models and the cost of generality in reinforcement learning algorithms.  ...  I'd like to thank Chris Wiggins for sharing his taxonomy on machine learning, Roy Frostig for shaping the Section 3.3, Pavel Pravdin for consulting on how to get policy gradient methods up and running,  ... 
arXiv:1806.09460v2 fatcat:4ago566hwzcudj3mcbb3j5wyr4

Intelligent Energy Management Systems for Electrified Vehicles: Current Status, Challenges, and Emerging Trends

Reihaneh Ostadian, John Ramoul, Atriya Biswas, Ali Emadi
2020 IEEE Open Journal of Vehicular Technology  
Index Terms: data-driven methods, electric vehicles, intelligent energy management strategy, reinforcement learning, powertrain architecture.  ...  By increasing the electrified vehicles trend, it is crucial to have a survey of what is done before and what is the open challenge for improving this industry.  ...  DRL employs deep neural network (DNN) in order to express state value function V(S_t), action value function Q(S_t, A_t), and policy function π(S) with function approximation instead of tabular approach  ... 
doi:10.1109/ojvt.2020.3018146 fatcat:7pqrnf52nze5fhpuyyhz4mjvtm

Reinforcement Learning and Deep Learning based Lateral Control for Autonomous Driving [article]

Dong Li, Dongbin Zhao, Qichao Zhang, Yaran Chen
2018 arXiv   pre-print
In order to improve the data efficiency, we propose visual TORCS (VTORCS), a deep reinforcement learning environment which is based on the open racing car simulator (TORCS).  ...  The trained reinforcement learning controller outperforms the linear quadratic regulator (LQR) controller and model predictive control (MPC) controller on different tracks.  ...  The critic Q^µ(s_t, a_t; w_2^Q), where w_2^Q are the network weights, approximates the optimal action-value function.  ... 
arXiv:1810.12778v1 fatcat:2yz3olk2v5g6tpxb2u67tozhpy