Convex nondifferentiable optimization: A survey focused on the analytic center cutting plane method

Jean-Louis Goffin, Jean-Philippe Vial
2002 Optimization Methods and Software  
We present a survey of nondifferentiable optimization problems and methods, with special focus on the analytic center cutting plane method. We propose a self-contained convergence analysis that uses the formalism of the theory of self-concordant functions, but for the main results we give direct proofs based on the properties of the logarithmic function. We also provide an in-depth analysis of two extensions that are very relevant to practical problems: the case of multiple cuts and the case of
deep cuts. We further examine extensions to problems including feasible sets partially described by an explicit barrier function, and to the case of nonlinear cuts. Finally, we review several implementation issues and discuss some applications.

1. Introduction. In a famous example, Wolfe [87] showed that standard optimization methods fail to solve a simple convex nondifferentiable optimization problem. Since nondifferentiable problems often arise as the result of some mathematical transformation (e.g., Lagrangian relaxation or Benders decomposition) of an originally large, and perhaps intractable, problem, there is a definite need for efficient methods. This inspired and gave rise to a significant and sophisticated literature. Our goal is to review this field, with a special focus on a cutting plane method based on the concept of analytic centers.

Convex nondifferentiable optimization can be characterized by the nature of the information that can be collected and used in the design of algorithms. Essentially, one has to replace differential by subdifferential calculus. At a test point, one can compute subgradients of functions, i.e., supports to the epigraphs, and linear separators between the point and the feasible set. The mechanism that computes the cutting planes is often called the oracle. Given a sequence of trial points, the oracle produces cutting planes that provide a polyhedral outer approximation of the problem. The main issue in the design of algorithms is to make clever use of this information to construct trial points. Many methods exist. Some use part of the information (e.g., the most recently generated cutting planes), some use all of it. We shall provide a brief survey of the main approaches.

A linear approximation often turns out to be a very poor representation of the underlying nonlinearities. Cutting plane algorithms face the risk of choosing trial points that look attractive with respect to the polyhedral approximation but turn out to be almost irrelevant for the original problem. Therefore, most methods propose a regularizing mechanism. Methods of centers exploit this idea: centers are less sensitive to the introduction of cutting planes than are the extreme points of the polyhedral approximation. Among many possible centers, the analytic center of a polytope has a number of advantages. The analytic center is an analytic function of the data describing the polytope, and thus its sensitivity to changes in these data is highly predictable. Furthermore, interior point methods offer highly efficient and robust methods to compute these centers. The analytic center cutting plane method (ACCPM) has been built on that principle. The theory underlying the method has been studied in depth by several authors, who provided complexity estimates for the basic method and for several of its enhancements. The method has been implemented; it has also been applied with success to a wide variety of problems. Our intent here is to collect many results and put them in a unified format, to give the reader a general perspective on the theoretical and practical properties of the method.
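To make the notion of a center concrete, the following minimal sketch (Python; the function name and instance data are hypothetical, and it is not the authors' implementation) computes the analytic center of a polytope $\{x : Ax \le b\}$, i.e., the minimizer of the logarithmic barrier $-\sum_i \log(b_i - a_i^T x)$, by damped Newton steps of length $1/(1+\lambda)$, where $\lambda$ is the Newton decrement.

```python
import numpy as np

# Minimal sketch (not the authors' code): analytic center of {x : A x <= b}
# obtained by damped Newton on the logarithmic barrier
#   F(x) = - sum_i log(b_i - a_i^T x).

def analytic_center(A, b, x0, tol=1e-8, max_iter=100):
    x = x0.astype(float)                    # x0 must be strictly feasible: A x0 < b
    for _ in range(max_iter):
        s = b - A @ x                       # slacks, positive inside the polytope
        grad = A.T @ (1.0 / s)              # gradient of the barrier
        hess = A.T @ np.diag(1.0 / s**2) @ A
        d = np.linalg.solve(hess, -grad)    # Newton direction
        lam = np.sqrt(d @ hess @ d)         # Newton decrement
        if lam < tol:
            break
        x = x + d / (1.0 + lam)             # damped step keeps A x < b
    return x

# Example: the box -1 <= x_i <= 1 in R^2; its analytic center is the origin.
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.ones(4)
print(analytic_center(A, b, x0=np.array([0.5, -0.3])))
```

In a cutting plane scheme of the ACCPM type, each new cut appends a row to $(A, b)$, and the next center is typically recomputed starting from the previous one, which is one reason centers are cheap to update in practice.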
The paper is organized as follows. In Section 2, we review the best-known sources of nondifferentiable optimization problems. In Section 3, we present the most prominent methods for solving them. In Section 4, we analyze the basic version of ACCPM, in the context of a convex feasibility problem endowed with an oracle producing one central cut at a time. The analytic center is associated with the logarithmic barrier, and the analysis relies very much on the properties of this function. The more general class of self-concordant functions shares many properties of the logarithmic function: it can be used to define more general analytic center cutting plane methods. Our presentation of the method does not rely on this larger class. Although we focus on the logarithmic barrier, we use formulations that can be extended to self-concordant functions, making it easy to establish links with more recent developments. In Section 5, we review two extensions that are very relevant in practice: the case of multiple cuts and the case of deep cuts.

2. Sources of nondifferentiable problems.

2.1. Lagrangian relaxation. The problem under investigation is
$$\min \{ f(x) : h(x) \le 0,\ x \in X \}, \qquad (2.1)$$
where $f$ is convex and $h$ is a vector-valued convex function, while the set $X$ is arbitrary and may, for instance, include integrality restrictions. The standard Lagrangian is
$$L(x,u) = f(x) + \langle u, h(x) \rangle,$$
where $u$ is the vector of Lagrange multipliers, or dual variables. The weak duality relationship
$$\min_{x \in X} \max_{u \ge 0} L(x,u) \;\ge\; \max_{u \ge 0} \min_{x \in X} L(x,u)$$
holds under no assumptions (the convexity of $f$ and $h$ is not even necessary). The left-hand side is the optimal value of the original problem, that is,
$$\min_{x \in X} \max_{u \ge 0} L(x,u) = \min \{ f(x) : h(x) \le 0,\ x \in X \}.$$
The dual function $L(u) = \min_{x \in X} L(x,u)$ is a concave nondifferentiable function taking values in the extended reals. The dual problem $\max_{u \ge 0} L(u)$ always provides a lower bound on the original problem. If $X$ is convex and classical regularity conditions are satisfied, the optimal value of the dual problem equals the optimal value of the original problem. If $X$ is finite, then $L(u)$ is a piecewise linear function, but the number of pieces is likely to be exponential. The key observation is that if $\bar{x} \in X$ is such that $L(\bar{u}) = L(\bar{x},\bar{u})$, then $L(u)$ satisfies the subgradient inequality
$$L(u) \le L(\bar{u}) + \langle h(\bar{x}), u - \bar{u} \rangle, \quad \forall u \ge 0.$$
The Lagrangian relaxation is attractive if the problem $\min_{x \in X} L(x,u)$ is easy to solve and $u$ is of moderate size. A typical situation where this property holds is when $X$ is a Cartesian product and $L(x,u)$ is separable on this product space. The case where $X$ is not convex, for instance because it includes integrality constraints, has led to a host of very successful applications. See, for instance, Held and Karp [43, 44] in their solution to the travelling salesman problem, Graves [42] in hierarchical planning, Geoffrion [29], Shapiro [78] and Fisher [24].
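As an illustration of how such a dual is evaluated, here is a minimal sketch (Python; the instance data c, a, b and the function names are hypothetical, not taken from the paper) for a toy problem with a single coupling constraint $\langle a, x \rangle - b \le 0$ and $X = \{0,1\}^n$. The oracle returns the dual value $L(u)$ together with the subgradient $h(\bar{x}) = \langle a, \bar{x} \rangle - b$ of the concave dual, which is exactly the information a nondifferentiable method, cutting plane or otherwise, consumes; a plain projected subgradient ascent is used here only to exercise the oracle.

```python
import numpy as np

# Hypothetical toy instance (not from the paper): minimize <c, x> subject to
# <a, x> - b <= 0 and x in X = {0,1}^n.  X is kept implicit in the oracle.
c = np.array([-3.0, -5.0, -4.0, -1.0])
a = np.array([ 2.0,  4.0,  3.0,  1.0])
b = 5.0

def dual_oracle(u):
    """Evaluate L(u) = min_{x in X} [ <c,x> + u * (<a,x> - b) ] and return a
    subgradient h(x_bar) = <a, x_bar> - b of the concave dual at u.
    For X = {0,1}^n the inner minimization separates over coordinates."""
    x_bar = (c + u * a < 0).astype(float)   # x_i = 1 exactly when it lowers the Lagrangian
    L_u = c @ x_bar + u * (a @ x_bar - b)
    return L_u, a @ x_bar - b

# Projected subgradient ascent on max_{u >= 0} L(u), only to show how the
# oracle is consumed; a cutting plane method would use the same pairs.
u, best = 0.0, -np.inf
for k in range(1, 101):
    L_u, g = dual_oracle(u)
    best = max(best, L_u)                   # L(u) is a lower bound on the primal value
    u = max(0.0, u + g / k)                 # diminishing step, projection onto u >= 0
print(f"best dual lower bound: {best:.3f}")
```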
2.2. Dantzig-Wolfe column generation. Let $f(x) = \langle c, x \rangle$ and $h(x) = Ax - b$, and assume that $X = \{ x : Dx \le d \}$. $X$ can be represented by the convex hull of its extreme points $\{ x^k : k \in I \}$ plus the conic hull of its extreme rays $\{ x^k : k \in J \}$. This allows us to express problem (2.1) as the following program: min

doi:10.1080/1055678021000060829a