On the Optimization Landscape of Dynamical Output Feedback Linear Quadratic Control

by Jingliang Duan, Wenhan Cao, Yang Zheng, Lin Zhao

Published by arXiv.

2022  

Abstract

The optimization landscape of optimal control problems plays an important role in the convergence of many policy gradient methods. Unlike state-feedback policies in the Linear Quadratic Regulator (LQR), static output-feedback policies are typically insufficient to achieve good closed-loop control performance. We investigate the optimization landscape of linear quadratic control using dynamical output feedback policies, denoted as dynamical LQR (dLQR) in this paper. We first show that the dLQR cost varies with similarity transformations. We then derive an explicit form of the optimal similarity transformation for a given observable stabilizing controller. We further characterize the unique observable stationary point of dLQR. This provides an optimality certificate for policy gradient methods under mild assumptions. Finally, we discuss the differences and connections between dLQR and the canonical linear quadratic Gaussian (LQG) control. These results shed light on designing policy gradient algorithms for decision-making problems with partially observed information.
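To make the setting concrete, here is a minimal sketch of the objects the abstract refers to; the notation (A, B, C for the plant and A_K, B_K, C_K for the controller) is assumed for illustration and is not taken from the record. A dynamical output-feedback policy for a plant x_{t+1} = A x_t + B u_t, y_t = C x_t is a dynamic controller

    \xi_{t+1} = A_K \xi_t + B_K y_t, \qquad u_t = C_K \xi_t,

and a similarity transformation by any invertible matrix T reparameterizes the same controller as

    (A_K, B_K, C_K) \;\mapsto\; (T A_K T^{-1},\; T B_K,\; C_K T^{-1}),

which leaves the closed-loop input-output behavior unchanged. The paper studies how the dLQR cost behaves under this reparameterization, showing in particular that the cost is not constant over such equivalent controllers and deriving the transformation that is optimal for a given observable stabilizing controller.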

Archived Files and Locations

application/pdf   2.7 MB
file_y4qmxxfwsnfdnansjtlljxr3fy
arxiv.org (repository)
web.archive.org (webarchive)
Type: article
Stage: published
Date: 2022-01-01
Version: 1
Work Entity
Access all versions, variants, and formats of this work (e.g., pre-prints)
Catalog Record
Revision: 6a970dc9-1e7c-4b20-ae42-800ffce7b7fb