Deep learning as optimal control problems: Models and numerical methods

Martin Benning (School of Mathematical Sciences, Queen Mary University of London, London E1 4NS, UK); Elena Celledoni (Department of Mathematical Sciences, NTNU, 7491 Trondheim, Norway); Matthias J. Ehrhardt (Institute for Mathematical Innovation, University of Bath, Bath BA2 7JU, UK); Brynjulf Owren (Department of Mathematical Sciences, NTNU, 7491 Trondheim, Norway); Carola-Bibiane Schönlieb (Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge CB3 0WA, UK)
2019 Journal of Computational Dynamics  
We consider recent work of [18] and [9], where deep learning neural networks have been interpreted as discretisations of an optimal control problem subject to an ordinary differential equation constraint. We review the first order conditions for optimality, and the conditions ensuring optimality after discretisation. This leads to a class of algorithms for solving the discrete optimal control problem which guarantee that the corresponding discrete necessary conditions for optimality are fulfilled. The differential equation setting lends itself to learning additional parameters such as the time discretisation. We explore this extension alongside natural constraints (e.g. time steps lie in a simplex). We compare these deep learning algorithms numerically in terms of induced flow and generalisation ability.
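The viewpoint in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a residual layer of the form x_{k+1} = x_k + h_k * tanh(K_k x_k + b_k), i.e. a forward Euler step of an ODE with step size h_k, and it keeps the step sizes in the simplex {h : h_k >= 0, sum(h) = T} with a standard sort-based Euclidean projection. The function names, the tanh activation and the choice of projection algorithm are assumptions made for this example only.

```python
import numpy as np

def project_to_simplex(h, total=1.0):
    """Euclidean projection of h onto {h : h_k >= 0, sum(h) = total}."""
    u = np.sort(h)[::-1]                       # entries in descending order
    css = np.cumsum(u) - total                 # shifted cumulative sums
    k = np.arange(1, len(h) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]   # last index satisfying the condition
    tau = css[rho] / (rho + 1)                 # shift that enforces the sum constraint
    return np.maximum(h - tau, 0.0)

def resnet_forward(x, weights, biases, steps):
    """Forward Euler discretisation: x <- x + h * tanh(K x + b) per layer."""
    for K, b, h in zip(weights, biases, steps):
        x = x + h * np.tanh(K @ x + b)
    return x

# Hypothetical usage: three layers acting on a 2-dimensional state.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 2)) for _ in range(3)]
biases = [rng.standard_normal(2) for _ in range(3)]
steps = project_to_simplex(np.array([0.5, 0.7, -0.2]))  # feasible time steps
output = resnet_forward(np.array([1.0, -2.0]), weights, biases, steps)
```

In a projected gradient scheme, one would take a gradient step on the time steps and then call project_to_simplex again, so that the steps stay non-negative and keep a fixed total time.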
doi:10.3934/jcd.2019009 fatcat:u7k3rr5qijec3nr6xbvo66vmta