Learning the model-free linear quadratic regulator via random search

Hesameddin Mohammadi, Mihailo R. Jovanovic, Mahdi Soltanolkotabi
2020 Conference on Learning for Dynamics & Control  
Model-free reinforcement learning techniques attempt to find an optimal control action for an unknown dynamical system by directly searching over the parameter space of controllers.  ...  In this paper, we examine the standard infinite-horizon linear quadratic regulator problem for continuous-time systems with unknown state-space parameters.  ...  the linear quadratic regulator (LQR).  ... 
dblp:conf/l4dc/MohammadiJS20 fatcat:654duzegs5d7him6tepoo4nmwi

Recovering Robustness in Model-Free Reinforcement learning [article]

Harish K. Venkataraman, Peter J. Seiler
2019 arXiv   pre-print
Reinforcement learning (RL) is used to directly design a control policy using data collected from the system. This paper considers the robustness of controllers trained via model-free RL.  ...  The discussion focuses on the standard model-based linear quadratic Gaussian (LQG) problem as a special instance of RL.  ...  L are the optimal linear quadratic regulator (LQR) and Kalman filter gains.  ... 
arXiv:1810.09337v3 fatcat:rdngooeuh5dgppg5b3jlg37r7y

A Tour of Reinforcement Learning: The View from Continuous Control [article]

Benjamin Recht
2018 arXiv   pre-print
In order to compare the relative merits of various techniques, this survey presents a case study of the Linear Quadratic Regulator (LQR) with unknown dynamics, perhaps the simplest and best-studied problem  ...  In particular, theory and experiment demonstrate the role and importance of models and the cost of generality in reinforcement learning algorithms.  ...  First, this work was generously supported in part by two forward-looking programs at DOD: the Mathematical Data Science program at ONR and the Foundations and Limits of Learning program at DARPA.  ... 
arXiv:1806.09460v2 fatcat:4ago566hwzcudj3mcbb3j5wyr4

Bayesian Optimization for Policy Search in High-Dimensional Systems via Automatic Domain Selection [article]

Lukas P. Fröhlich, Edgar D. Klenske, Christian G. Daniel, Melanie N. Zeilinger
2020 arXiv   pre-print
The contributions of this paper are twofold: 1) We show how we can make use of a learned dynamics model in combination with a model-based controller to simplify the BO problem by focusing onto the most  ...  relevant regions of the optimization domain. 2) Based on (1) we present a method to find an embedding in parameter space that reduces the effective dimensionality of the optimization problem.  ...  a linear quadratic regulator (LQR) controller.  ... 
arXiv:2001.07394v1 fatcat:mhdal3cgvrfv5af2cxcicqfrbm

Combining Model-Based and Model-Free Methods for Nonlinear Control: A Provably Convergent Policy Gradient Approach [article]

Guannan Qu, Chenkai Yu, Steven Low, Adam Wierman
2020 arXiv   pre-print
We show this hybrid approach outperforms the model-based controller while avoiding the convergence issues associated with model-free approaches via both numerical experiments and theoretical analyses,  ...  We consider a dynamical system with both linear and non-linear components and develop a novel approach to use the linear model to define a warm start for a model-free, policy gradient method.  ...  Our work is mostly related to the class of model-free policy search methods for the Linear Quadratic Regulator (LQR), e.g. zeroth order policy search in Fazel et al. (2018); Malik et al. (2018); Bu et  ... 
arXiv:2006.07476v1 fatcat:mvwi3uqembh3dd4zawscq4rwja

Simple random search provides a competitive approach to reinforcement learning [article]

Horia Mania, Aurelia Guy, Benjamin Recht
2018 arXiv   pre-print
A common belief in model-free reinforcement learning is that methods based on random search in the parameter space of policies exhibit significantly worse sample complexity than those that explore the  ...  Computationally, our random search algorithm is at least 15 times more efficient than the fastest competing model-free methods on these benchmarks.  ...  BR is generously supported in part by NSF award CCF-1359814, ONR awards N00014-14-1-0024 and N00014-17-1-2191, the DARPA Fundamental Limits of Learning (Fun LoL) Program, and an Amazon AWS AI Research  ... 
arXiv:1803.07055v1 fatcat:luzazhf24nbj5co7q2x7fuvusi

Deep Reinforcement Learning Based Volt-VAR Optimization in Smart Distribution Systems [article]

Ying Zhang, Xinan Wang, Jianhui Wang, Yingchen Zhang
2020 arXiv   pre-print
This paper develops a model-free volt-VAR optimization (VVO) algorithm via multi-agent deep reinforcement learning (MADRL) in unbalanced distribution systems.  ...  Numerical simulations validate the excellent performance of this method in voltage regulation and power loss reduction.  ...  VVO Performance This section demonstrates the VVO performance of the proposed model-free MADRL method when facing random operating conditions, in terms of voltage regulation and power loss reduction, both  ... 
arXiv:2003.03681v2 fatcat:kaysoexl3rg7fjmpwdjwb5ki3q

Primal-dual Learning for the Model-free Risk-constrained Linear Quadratic Regulator [article]

Feiran Zhao, Keyou You
2021 arXiv   pre-print
In this work, we propose a model-free framework to learn a risk-aware controller with a focus on the linear system.  ...  Alongside, we find that the Lagrangian function enjoys an important local gradient dominance property, which is then exploited to develop a convergent random search algorithm to learn the dual function  ...  Random search for learning the linear quadratic regulator. In 2020 American Control Conference (ACC), pages 4798-4803. IEEE, 2020b. John B Moore, Robert J Elliott, and Subhrakanti Dey.  ... 
arXiv:2011.10931v4 fatcat:imhbx7cb2vfbfjxxp5aojgvmle

Bayesian Optimization for Policy Search in High-Dimensional Systems via Automatic Domain Selection

Lukas P. Fröhlich, Edgar D. Klenske, Christian G. Daniel, Melanie N. Zeilinger
2019 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)  
The contributions of this paper are twofold: 1) We show how we can make use of a learned dynamics model in combination with a model-based controller to simplify the BO problem by focusing onto the most  ...  relevant regions of the optimization domain. 2) Based on (1) we present a method to find an embedding in parameter space that reduces the effective dimensionality of the optimization problem.  ...  a linear quadratic regulator (LQR) controller.  ... 
doi:10.1109/iros40897.2019.8967736 dblp:conf/iros/FrohlichKDZ19 fatcat:a6yyuapvdrgmlmrglomzjo6aue

Improving the Exploration of Deep Reinforcement Learning in Continuous Domains using Planning for Policy Search [article]

Jakob J. Hollenstein, Erwan Renaudo, Matteo Saveriano, Justus Piater
2020 arXiv   pre-print
We call the resulting model-based reinforcement learning method PPS (Planning for Policy Search).  ...  To better exploit simulation models in policy search, we propose to integrate a kinodynamic planner in the exploration strategy and to learn a control policy in an offline fashion from the generated environment  ...  The steering approach based on a linear MPC with shrinking horizon is summarized in Algorithm 4.  ... 
arXiv:2010.12974v1 fatcat:7yxryhlhifdnfahjobiy35z7zi

2020 Index IEEE Transactions on Automatic Control Vol. 65

2020 IEEE Transactions on Automatic Control  
Mizutani, E., TAC June 2020 2716-2723 A Simple Proof of Indefinite Linear-Quadratic Stochastic Optimal Control With Random Coefficients.  ...  Linear programming A Decentralized Event-Based Approach for Robust Model Predictive Control.  ... 
doi:10.1109/tac.2020.3046985 fatcat:hfiqhyr7sffqtewdmcwzsrugva

2020 Index IEEE Transactions on Cybernetics Vol. 50

2020 IEEE Transactions on Cybernetics  
Xiao, S., +, TCYB March 2020 1220-1229 Linear quadratic control Reinforcement Learning-Based Linear Quadratic Regulation of Continuous-Time Systems Using Dynamic Output Feedback.  ...  ., +, TCYB June 2020 2346-2356 Reinforcement Learning-Based Linear Quadratic Regulation of Continuous-Time Systems Using Dynamic Output Feedback.  ...  Stock markets A Quantum-Inspired Similarity Measure for the Analysis of Complete Weighted Graphs. Bai, L., +, TCYB March 2020 1264 -1277  ... 
doi:10.1109/tcyb.2020.3047216 fatcat:5giw32c2u5h23fu4drupnh644a

Curious iLQR: Resolving Uncertainty in Model-based RL [article]

Sarah Bechtle, Yixin Lin, Akshara Rai, Ludovic Righetti, Franziska Meier
2019 arXiv   pre-print
In this work, we propose a model-based reinforcement learning (MBRL) framework that combines Bayesian modeling of the system dynamics with curious iLQR, an iterative LQR approach that considers model uncertainty  ...  Our experiments show that MBRL with curious iLQR reaches desired end-effector targets more reliably and with fewer system rollouts when learning a new task from scratch, and that the learned model generalizes  ...  Acknowledgments The authors would like to thank Stefan Schaal for his advice throughout the project.  ... 
arXiv:1904.06786v2 fatcat:hglaqdinhzf3tchhydokno7jyy

Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning [article]

Yevgen Chebotar, Karol Hausman, Marvin Zhang, Gaurav Sukhatme, Stefan Schaal, Sergey Levine
2017 arXiv   pre-print
By focusing on time-varying linear-Gaussian policies, we enable a model-based algorithm based on the linear quadratic regulator (LQR) that can be integrated into the model-free framework of path integral  ...  These requirements are handled well by model-based and model-free RL approaches, respectively. In this work, we aim to combine the advantages of these two types of methods in a principled manner.  ...  Acknowledgements The authors would like to thank Sean Mason for his help with preparing the real robot experiments.  ... 
arXiv:1703.03078v3 fatcat:cricu37rongvlkkevecodhszse

Simple random search of static linear policies is competitive for reinforcement learning

Horia Mania, Aurelia Guy, Benjamin Recht
2018 Neural Information Processing Systems  
We introduce a model-free random search algorithm for training static, linear policies for continuous control problems.  ...  Model-free reinforcement learning aims to offer off-the-shelf solutions for controlling dynamical systems without requiring models of the system dynamics.  ...  BR is generously supported in part by NSF award CCF-1359814, ONR awards N00014-14-1-0024 and N00014-17-1-2191, the DARPA Fundamental Limits of Learning (Fun LoL) Program, and an Amazon AWS AI Research  ... 
dblp:conf/nips/ManiaGR18 fatcat:bxxefodppzhu7kwzwjtuy4yfdq