### Linearly Solvable Optimal Control [chapter]

K. Dvijotham, E. Todorov
2013 Reinforcement Learning and Approximate Dynamic Programming for Feedback Control
We summarize the recently developed framework of linearly-solvable stochastic optimal control. Using an exponential transformation, the (Hamilton-Jacobi) Bellman equation for such problems can be made linear, giving rise to efficient numerical methods. Extensions to game theory are also possible and lead to linear Isaacs equations. The key restriction that makes a stochastic optimal control problem linearly-solvable is that the noise and the controls must act in the same subspace. Apart from being linearly-solvable, problems in this class have a number of unique properties including: path-integral interpretation of the exponentiated value function; compositionality of optimal control laws; duality with Bayesian inference; trajectory-based Maximum Principle for stochastic control. Development of a general class of more easily solvable problems tends to accelerate progress, as linear systems theory has done. The new framework may have similar impact in fields where stochastic optimal control is relevant.

INTRODUCTION

Optimal control is of interest in many fields of science and engineering [4, 21], and is arguably at the core of robust-yet-efficient animal behavior [23, 26]. Apart from the fact that "optimal" tends to be good even when it is not exactly optimal, this approach to control engineering is appealing because one can in principle define a high-level cost function specifying the task goal, and leave the hard work of synthesizing a controller to numerical optimization software. This leads to better automation, especially when compared to the manual designs often used in engineering practice. Yet optimizing controllers for real-world tasks is very challenging even numerically, and the present book explores the state-of-the-art approaches to overcoming this challenge.

One of the most productive lines of attack when it comes to solving hard problems is to identify restricted problem formulations that can be solved efficiently, and use these restricted formulations to approximate (perhaps iteratively) the harder problem. An example is the field of numerical optimization, where the only multivariate function we know how to optimize analytically is the quadratic, and so we model every other function as being locally quadratic. This is the key idea behind all second-order methods.
The situation is similar in optimal control and control theory in general, where the only systems we truly understand are linear, and so we often approximate many other systems as being linear, either locally or globally. An example of an optimal control method relying on iterative linearizations of the dynamics (and quadratizations of the cost) is the iterative LQG method. This general approach to solving hard problems relies on having restricted problem formulations that are computationally tractable. For too long, linear systems theory has remained pretty much the only item on the menu.

Recently, we and others have developed a restricted class of stochastic optimal control problems that are linearly-solvable [14, 27]. The dynamics in such problems can be non-linear (and even non-smooth), the costs can be non-quadratic, and the noise can be non-Gaussian. Yet the problem reduces to solving a linear equation, which is a minimized and exponentially transformed Bellman equation. To be sure, this is not nearly as tractable as an LQG problem, because the linear equation in question is a functional equation characterizing a scalar function (the exponentiated value function) over a high-dimensional continuous state space. Nevertheless, solving such problems is much easier computationally than solving generic optimal control problems.

The key restriction that makes a stochastic optimal control problem linearly-solvable is that the noise and the controls are interchangeable, i.e. anything that the control law can accomplish could also happen by chance (however small the probability may be) and vice versa. The control cost associated with a given outcome is inversely related to the probability of the same outcome under the passive/uncontrolled dynamics. The form of this control cost is fixed, while the state cost can be arbitrary. Apart from being linearly-solvable, problems in this class have unique properties that enable specialized numerical algorithms.
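The reduction to a linear equation can be made concrete in a small discrete setting. The sketch below assumes the discrete-state, first-exit formulation from the linearly-solvable MDP literature, which this section does not spell out: with passive transition matrix `P`, state cost `q`, and desirability `z = exp(-v)` for value function `v`, the minimized Bellman equation becomes `z = exp(-q) * (P @ z)` on non-terminal states, and the optimal controlled transitions are the passive ones reweighted by `z`. The five-state chain, the cost values, and the function name `solve_first_exit_lmdp` are illustrative assumptions, not taken from the chapter.

```python
import numpy as np


def solve_first_exit_lmdp(P, q, terminal):
    """Solve the linear desirability equation of a first-exit problem.

    Assumed formulation (not stated in this section):
        z(x) = exp(-q(x)) * sum_x' P[x, x'] z(x')   for non-terminal x
        z(x) = exp(-q(x))                           for terminal x
    Returns desirability z, value v = -log z, and controlled transitions U.
    """
    P = np.asarray(P, dtype=float)
    q = np.asarray(q, dtype=float)
    terminal = np.asarray(terminal, dtype=bool)
    interior = ~terminal

    z = np.empty(len(q))
    z[terminal] = np.exp(-q[terminal])  # boundary condition

    # Restrict the equation to interior states: (I - G P_ii) z_i = G P_it z_t
    G = np.diag(np.exp(-q[interior]))
    P_ii = P[np.ix_(interior, interior)]
    P_it = P[np.ix_(interior, terminal)]
    A = np.eye(interior.sum()) - G @ P_ii
    b = G @ P_it @ z[terminal]
    z[interior] = np.linalg.solve(A, b)

    # Controlled transitions: passive dynamics reweighted by desirability
    U = P * z[None, :]
    U /= U.sum(axis=1, keepdims=True)
    return z, -np.log(z), U


# Illustrative chain: unbiased random walk on 5 states, goal at the right end
n = 5
P = np.zeros((n, n))
P[0, 0] = P[0, 1] = 0.5
for i in range(1, n - 1):
    P[i, i - 1] = P[i, i + 1] = 0.5
P[n - 1, n - 1] = 1.0  # the goal state is absorbing

q = np.array([0.1, 0.1, 0.1, 0.1, 0.0])  # per-step state cost, zero at goal
terminal = np.array([False, False, False, False, True])

z, v, U = solve_first_exit_lmdp(P, q, terminal)
print("desirability:", np.round(z, 3))
print("value:       ", np.round(v, 3))
print("P(move right from state 2):", round(U[2, 3], 3))
```

No iteration over policies is needed in this sketch: a single linear solve yields the value function, and the controlled transitions only ever reweight outcomes that already have nonzero probability under the passive dynamics, which mirrors the interchangeability of noise and controls described above.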
These unique properties can be summarized as follows: