Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator
[article]
Maryam Fazel, Rong Ge, Sham M. Kakade, Mehran Mesbahi
2019
arXiv
pre-print
Direct policy gradient methods for reinforcement learning and continuous control problems are a popular approach for a variety of reasons: 1) they are easy to implement without explicit knowledge of the ...
In contrast, system identification and model-based planning in optimal control theory have a much more solid theoretical footing, where much is known with regard to their computational and statistical ...
S. K. thanks Emo Todorov, Aravind Rajeswaran, Kendall Lowrey, Sanjeev Arora, and Elad Hazan for helpful discussions. S. K. and M. F. also thank Ben Recht for helpful discussions. R. ...
arXiv:1801.05039v3
fatcat:hf7gpybbxnfkrbuhzladrgvkby