### A hybrid semismooth quasi-Newton method for nonsmooth optimal control with PDEs

Florian Mannel, Armin Rund
2020 Optimization and Engineering
We propose a semismooth Newton-type method for nonsmooth optimal control problems. Its particular feature is the combination of a quasi-Newton method with a semismooth Newton method. This reduces the computational costs in comparison to semismooth Newton methods while maintaining local superlinear convergence. The method applies to Hilbert space problems whose objective is the sum of a smooth function, a regularization term, and a nonsmooth convex function. In the theoretical part of this work
we establish the local superlinear convergence of the method in an infinite-dimensional setting and discuss its application to sparse optimal control of the heat equation subject to box constraints. We verify that the assumptions for local superlinear convergence are satisfied in this application and we prove that convergence can take place in stronger norms than that of the Hilbert space if initial error and problem data permit. In the numerical part we provide a thorough study of the hybrid approach on two optimal control problems, including an engineering problem from magnetic resonance imaging that involves bilinear control of the Bloch equations. We use this problem to demonstrate that the new method is capable of solving nonconvex, nonsmooth large-scale real-world problems. Among other topics, the study addresses mesh independence, globalization techniques, and limited-memory methods. We observe throughout that algorithms based on the hybrid methodology are several times faster in runtime than their semismooth Newton counterparts.

Keywords: Semismooth Newton methods · Quasi-Newton methods · Superlinear convergence · Nonsmooth optimal control · Bloch equations

Remark 1. Note that $\operatorname{Prox}_{\varphi_\gamma}$ is an operator from $U$ to $U$, but is required to be semismooth from $Q$ to $U$ in 6). Note, furthermore, that $\bar q \in Q$ holds in 6) due to (2).

Remark 2. Under Assumption 1 there are constants $L_P, L_\nabla > 0$ such that […] are satisfied for all $q$ close to $\bar q$, respectively, for all $u$ close to $\bar u$. The constants $L_P$ and $L_\nabla$ will appear in the convergence results below.

Remark 3. It would be enough to require 3)-5) only locally around $\bar u$.

Since $f$ can be nonconvex and since $\varphi$ can be nonsmooth, (P) is, in general, a nonconvex and nonsmooth optimization problem. It may also feature a convex admissible set, as $\varphi$ is extended real-valued. We tackle (P) by reformulating its first-order optimality condition as the operator equation $H(q) = 0$.
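To make the reformulation concrete, the following is a minimal finite-dimensional sketch, not the authors' implementation. It assumes that the smooth part without the regularization term is a least-squares function $\hat f(u) = \tfrac12\|Au - b\|^2$, that $\varphi(u) = \beta\|u\|_1$ plus the indicator of a box, and that $H$ has the normal-map form $H(q) = \nabla \hat f(\operatorname{Prox}_{\varphi_\gamma}(q)) + \gamma q$ with $\varphi_\gamma = \varphi/\gamma$. All names, the test data, and the proximal gradient reference solver are hypothetical choices made for illustration. The sketch computes a minimizer $\bar u$, sets $\bar q = -\tfrac{1}{\gamma}\nabla \hat f(\bar u)$, and checks numerically that $H(\bar q) \approx 0$ and $\operatorname{Prox}_{\varphi_\gamma}(\bar q) \approx \bar u$.

```python
import numpy as np

def soft_threshold(v, tau):
    """Componentwise prox of tau * |.| (soft shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_phi(v, t, beta, lo, hi):
    """Prox of t * phi, where phi(u) = beta * ||u||_1 + indicator of [lo, hi]^n.
    phi is separable, so the prox is soft shrinkage followed by clipping."""
    return np.clip(soft_threshold(v, t * beta), lo, hi)

def grad_f_hat(u, A, b):
    """Gradient of the assumed smooth part f_hat(u) = 0.5 * ||A u - b||^2."""
    return A.T @ (A @ u - b)

def H(q, A, b, gamma, beta, lo, hi):
    """Assumed normal-map residual: grad f_hat(Prox_{phi/gamma}(q)) + gamma * q."""
    u = prox_phi(q, 1.0 / gamma, beta, lo, hi)
    return grad_f_hat(u, A, b) + gamma * q

def solve_reference(A, b, gamma, beta, lo, hi, iters=5000):
    """Plain proximal gradient method on f_hat(u) + gamma/2 * ||u||^2 + phi(u).
    Used only to obtain a reference minimizer; it is not the hybrid method."""
    L = np.linalg.norm(A, 2) ** 2 + gamma  # Lipschitz constant of the smooth gradient
    u = np.zeros(A.shape[1])
    for _ in range(iters):
        grad_f = grad_f_hat(u, A, b) + gamma * u
        u = prox_phi(u - grad_f / L, 1.0 / L, beta, lo, hi)
    return u

# Hypothetical test data
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 20)) / np.sqrt(30)
b = rng.standard_normal(30)
gamma, beta, lo, hi = 1.0, 0.1, -0.5, 0.5

u_bar = solve_reference(A, b, gamma, beta, lo, hi)
q_bar = -grad_f_hat(u_bar, A, b) / gamma

residual = np.linalg.norm(H(q_bar, A, b, gamma, beta, lo, hi))
prox_gap = np.linalg.norm(prox_phi(q_bar, 1.0 / gamma, beta, lo, hi) - u_bar)
print(f"||H(q_bar)|| = {residual:.2e}, ||Prox(q_bar) - u_bar|| = {prox_gap:.2e}")
```

In this sketch the unknown of the reformulated equation is $q$ rather than $u$. The control is recovered as $u = \operatorname{Prox}_{\varphi_\gamma}(q)$, so sparsity and the box constraints are enforced by the proximal map itself.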
The approach of using Robinson's normal map (Robinson 1992) for the reformulation is inspired by (Pieper 2015, Section 3), which is one of the rather few references that we are aware of where a prox-based reformulation of the optimality conditions is used in the context of infinite-dimensional PDE-constrained optimal control. This approach is, however, quite common in finite-dimensional optimization, in particular in connection with first-order methods, cf., e.g., Beck (2017) and Parikh and Boyd (2014). Also, let us point out that semismoothness of proximal maps is addressed in (Xiao et al. 2018, Section 3) and (Milzarek 2016, Section 3.3) for finite dimensions as well as in (Pieper 2015, Section 3.3) for infinite dimensions.

Lemma 1. Let Assumption 1 hold. Then $\bar u$ satisfies the necessary optimality condition $0 \in \nabla f(\bar u) + \partial\varphi(\bar u)$ of (P) and $\bar q$ satisfies $H(\bar q) = 0$, where $H$ is given by (3). Moreover, for any $\hat q \in Q$ with $H(\hat q) = 0$ the point $\hat u := \operatorname{Prox}_{\varphi_\gamma}(\hat q)$ satisfies $0 \in \nabla f(\hat u) + \partial\varphi(\hat u)$. If the objective in (P) is convex, then any such $\hat u$ is a global solution of (P).

Proof. It is well known that the local solution $\bar u$ of (P) satisfies $0 \in \nabla f(\bar u) + \partial\varphi(\bar u)$. Since $\bar q = -\frac{1}{\gamma}\nabla \hat f(\bar u)$ by definition, we obtain $\gamma(\bar q - \bar u) = -\nabla f(\bar u) \in \partial\varphi(\bar u)$, hence $\bar u = \operatorname{Prox}_{\varphi_\gamma}(\bar q)$ by (1). Inserting this into $\bar q = -\frac{1}{\gamma}\nabla \hat f(\bar u)$ implies $H(\bar q) = 0$. If $\hat q \in Q$ with $H(\hat q) = 0$ is given and we set $\hat u := \operatorname{Prox}_{\varphi_\gamma}(\hat q) \in U$, then we have