A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL.
The file type is application/pdf.
Policy Optimization for H2 Linear Control with H∞ Robustness Guarantee: Implicit Regularization and Global Convergence
2020
Conference on Learning for Dynamics & Control
Policy optimization (PO) is a key ingredient for modern reinforcement learning (RL). For control design, certain constraints are usually enforced on the policies to optimize, accounting for stability, robustness, or safety concerns on the system. Hence, PO is by nature a constrained (nonconvex) optimization in most cases, whose global convergence is challenging to analyze in general. More importantly, some constraints that are safety-critical, e.g., the closed-loop stability, or the H∞-norm …
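The abstract's central point, that PO for control is constrained optimization because the policy iterates must stay stabilizing, can be made concrete with a toy sketch. This is not the paper's algorithm: the scalar system, cost weights, step size, and the simple "reject destabilizing steps" rule below are all hypothetical illustrative choices.

```python
# Illustrative sketch (not the paper's method): policy gradient on a scalar
# discrete-time LQR problem, with iterates kept inside the set of stabilizing
# gains -- the safety-critical constraint the abstract refers to.
# All system and cost parameters here are hypothetical toy values.
A, B = 1.2, 1.0          # unstable open-loop scalar system x_{t+1} = A x + B u
Q, R = 1.0, 1.0          # quadratic state / input cost weights

def spectral_radius(K):
    """Closed-loop magnitude |A - B K|; < 1 means u = -K x stabilizes."""
    return abs(A - B * K)

def lqr_cost(K, x0=1.0, horizon=200):
    """Finite-horizon rollout approximating the infinite-horizon H2 cost."""
    x, J = x0, 0.0
    for _ in range(horizon):
        u = -K * x
        J += Q * x * x + R * u * u
        x = (A - B * K) * x
    return J

def policy_gradient_step(K, lr=1e-3, eps=1e-5):
    """One step of numerical gradient descent; the step is rejected if it
    would leave the stabilizing set, a crude stand-in for the constraint
    handling the abstract alludes to."""
    grad = (lqr_cost(K + eps) - lqr_cost(K - eps)) / (2 * eps)
    K_new = K - lr * grad
    return K_new if spectral_radius(K_new) < 1.0 else K

K = 0.5                   # initial stabilizing gain: |1.2 - 0.5| = 0.7 < 1
costs = [lqr_cost(K)]
for _ in range(100):
    K = policy_gradient_step(K)
    costs.append(lqr_cost(K))

# Every iterate remains stabilizing while the cost decreases.
print(spectral_radius(K) < 1.0, costs[-1] < costs[0])
```

The rejection rule is the simplest possible feasibility safeguard; the paper's title suggests its contribution is showing such explicit safeguards can be unnecessary ("implicit regularization"), which this toy example does not attempt to reproduce.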
dblp:conf/l4dc/ZhangHB20
fatcat:lj2sltqdgvedfi6okypcuec3m4