Techniques in Computational Stochastic Dynamic Programming [chapter]

Floyd B. Hanson
1996 Control and Dynamic Systems  
The finite element method has computational and memory advantages: it requires fewer nodes than a finite difference method of comparable accuracy. We have shown [20] that the finite element method not only helps to alleviate Bellman's Curse of Dimensionality in dynamic programming computations by permitting the solution of higher-dimensional problems, but also saves supercomputer storage.
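To make the storage argument concrete (a back-of-the-envelope sketch, not a figure from the chapter): if each of the $d$ state dimensions is discretized with $N$ nodes, total storage grows as $N^d$, so any per-dimension node reduction achieved by the finite element method is compounded exponentially in the dimension:

```latex
M = N^{d}
\qquad\Longrightarrow\qquad
\frac{M_{\mathrm{FD}}}{M_{\mathrm{FE}}}
  = \left(\frac{N_{\mathrm{FD}}}{N_{\mathrm{FE}}}\right)^{d}
```

For example, halving the nodes per dimension in a six-state problem reduces storage by a factor of $2^{6} = 64$.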
The general aim is to develop fast and efficient parallel computational algorithms and data structures for optimal feedback control of large scale, continuous time, nonlinear, stochastic dynamical systems. Since the finite element procedure requires formulation of the mesh data structures, it is desirable to study the mapping from the problem's conceptual structure to the machine configuration for either Cray or Connection Machine computational models [116]. The computational treatment of Poisson noise is a particularly distinctive feature of this chapter. The numerical approach directly treats the partial differential equation of stochastic dynamic programming. The results give the optimal feedback control and the expected optimal performance index as functions of the state variables and time. For the stochastic optimal control problem, Monte Carlo and other simulations based on random number generation are the primary alternative to direct dynamic programming computations, but they have disadvantages: determining a sufficient sample size is complicated for general problems, and it is unclear how feedback control can be maintained. Furthermore, simulation calculations require randomly generating very complicated Markov processes and averaging a tremendous number of sample trajectories, whereas in the stochastic dynamic programming approach the averaging over the stochastic processes is built into the equation of dynamic programming. Hence, there is a great need to develop high performance computing techniques in stochastic dynamic programming for the direct solution of stochastic optimal control problems. The report of the panel on Future Directions in Control Theory [42] confirms the need for advanced scientific computing, both parallelization and vectorization, in control problems.
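For orientation, the partial differential equation referred to can be sketched in a scalar-state form (a generic jump-diffusion formulation from the literature; the chapter's multidimensional version has the same structure). With dynamics $dX = f(X,u,t)\,dt + g(X,t)\,dW + h(X,t)\,dP$, instantaneous cost $C$, Poisson jump rate $\lambda$, and minimal expected cost $v(x,t)$:

```latex
0 = \frac{\partial v}{\partial t}
  + \min_{u}\Big[\, C(x,u,t)
  + f(x,u,t)\,\frac{\partial v}{\partial x}
  + \tfrac{1}{2}\,g^{2}(x,t)\,\frac{\partial^{2} v}{\partial x^{2}}
  + \lambda\,\big( v(x + h(x,t),\,t) - v(x,t) \big) \Big]
```

Note that the Poisson term is a nonlocal difference rather than a derivative, which is one reason the jump case is computationally harder than the pure-diffusion case.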
The National Computing Initiative [97] calls stochastic dynamic programming computationally demanding, but stops short of classifying it as a Grand Challenge, although it belongs alongside other problems of similar computational demands. Applications of stochastic dynamic programming arise in many areas, such as aerospace dynamics, financial economics, resource management, robotics and power generation. Another main effort in this area, in addition to our own, has been in France, where Quadrat and his coworkers [1] at INRIA have developed an expert system that produces a multitude of results for stochastic differential equations with Gaussian noise, provided that discounting is constant and the problem can be transformed to a stationary one. Dantas de Melo, Calvet and Garcia [13, 26], also in France, have used Cray-2 multitasking for discrete time dynamic programming problems. Kushner and coworkers [75, 76, 77] have recently described many numerical approaches to stochastic control, with special emphasis on the well-developed Markov chain approximation method. Much theoretical progress has also been made using viscosity solutions [21, 108, 22]. Shoemaker and coworkers [79, 16, 24, 25] have applied several variants of the deterministic differential dynamic programming algorithm to groundwater applications. Differential dynamic programming, originally developed by Mayne [91], is a modification of dynamic programming based upon quadratic expansions in state and control differentials. Luus [87, 88] has developed a method for deterministic, high dimensional dynamic programming problems that reduces the grid size in both state and control, or in control alone, so that the method converges to the optimal control and state trajectories as the region reduction iterations proceed.
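The region-reduction idea can be illustrated with a deliberately simplified one-dimensional sketch (a toy, not Luus's actual algorithm, which shrinks grids over whole state and control trajectories): evaluate a coarse grid of candidate controls in the current region, keep the best one, then contract the region around it and repeat.

```python
def region_reduction_minimize(cost, u_center=0.0, radius=2.0,
                              n_grid=11, n_iters=25, gamma=0.8):
    """Toy 1-D region-reduction search: sample a uniform grid of candidate
    controls spanning the current region, keep the best candidate, then
    shrink the region around it by the factor gamma and repeat."""
    best_u, best_cost = u_center, cost(u_center)
    for _ in range(n_iters):
        # Grid of candidates centered on the current best control.
        candidates = [best_u - radius + 2.0 * radius * k / (n_grid - 1)
                      for k in range(n_grid)]
        for u in candidates:
            c = cost(u)
            if c < best_cost:
                best_u, best_cost = u, c
        radius *= gamma  # region contraction step
    return best_u, best_cost
```

For instance, `region_reduction_minimize(lambda u: (u - 1.3) ** 2)` converges to a control near 1.3 as the region shrinks.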
The author and his co-workers have been developing computational mathematics solutions for fairly general stochastic dynamic programming problems in continuous time using high performance computing techniques. The presentation in this chapter is in the formal manner of classical applied mathematics in order to focus on the methods and their implementation. Section II discusses computational stochastic dynamic programming for continuous time problems, and Section III discusses advanced techniques. In Section IV, the direct stochastic dynamic programming approach is compared in some detail with the algorithm models of differential dynamic programming and the Markov chain approximation. These methods are selected for in-depth comparison, rather than a broad survey without much depth, because they are actively used to solve optimal control problems of similar type; they are reformulated in such a way as to facilitate the comparison. In Section V, research directions are briefly mentioned.

The goal is the development of fast and efficient computational algorithms for the relatively general, larger-dimension optimal feedback control of nonlinear dynamical systems perturbed by stochastic diffusion and Poisson jump processes. The diffusion processes represent the continuous, background component of the perturbations, such as fluctuating population death rates, randomly varying winds and other background environmental noise. The Poisson processes represent the discontinuous, rare event component, such as occasional mass mortalities, large random weather changes or other large environmental effects. The Poisson perturbations model the more disastrous disturbances, which for many realistic models matter more than the continuous but nonsmooth disturbances arising from Markov diffusions. The treatment of Poisson noise is a major feature here.
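The dynamics just described can be sketched in simulation form. The following is a minimal Euler-type scheme for a scalar jump-diffusion; all coefficients and parameter values are illustrative assumptions, not the chapter's models:

```python
import math
import random

def simulate_jump_diffusion(x0=1.0, T=1.0, n=1000, lam=0.5, seed=7,
                            f=lambda x: 0.05 * x,    # drift: deterministic growth
                            g=lambda x: 0.20 * x,    # diffusion: background noise
                            h=lambda x: -0.30 * x):  # jump: rare disastrous loss
    """Euler scheme for dX = f(X) dt + g(X) dW + h(X) dP, where W is a
    Wiener process and P a Poisson process with jump rate lam."""
    rng = random.Random(seed)
    dt = T / n
    x = x0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))        # Brownian increment
        dp = 1 if rng.random() < lam * dt else 0  # Poisson jump indicator
        x += f(x) * dt + g(x) * dw + h(x) * dp
    return x
```

Averaging many such trajectories is exactly the Monte Carlo alternative discussed above; the dynamic programming approach instead builds that averaging into the partial differential equation.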
However, there has been much more research on Markov diffusions, undoubtedly because they are much easier to analyze than discontinuous Poisson noise. Random deviations from deterministic results tend to occur in regions of high cost and possible failure, indicating the need for fast algorithms that can handle large fluctuations. Our goal is that the results be in a practical form suitable for applications. The motivation for this research comes from bioeconomic modeling, but the procedures developed are applicable to a wide range of biological, physical, chemical, and engineering applications with a stochastic dynamical system governing the motion or growth of the system and a performance or cost function that needs to be optimized. Our applications so far have been primarily the optimal harvesting of fisheries resources. Athans et al. [6] analyze a flight dynamics application perturbed by Gaussian noise; this application could be treated with the more general random noise described here to model more realistic test conditions. Quadrat and coworkers [1] have made applications to the control of electric power systems. One emphasis here is the use of high performance computing techniques on a wider range of applications.
doi:10.1016/s0090-5267(96)80017-x