EXTRA: An Exact First-Order Algorithm for Decentralized Consensus Optimization

Wei Shi, Qing Ling, Gang Wu, Wotao Yin
2015 SIAM Journal on Optimization  
Recently, there has been growing interest in solving consensus optimization problems in a multiagent network. In this paper, we develop a decentralized algorithm for the consensus optimization problem $\operatorname{minimize}_{x \in \mathbb{R}^p} \bar{f}(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x)$, which is defined over a connected network of $n$ agents, where each function $f_i$ is held privately by agent $i$ and encodes the agent's data and objective. All the agents shall collaboratively find the minimizer while each agent can only communicate with its neighbors. Such a computation scheme avoids a data fusion center or long-distance communication and offers better load balance to the network. This paper proposes a novel decentralized exact first-order algorithm (abbreviated as EXTRA) to solve the consensus optimization problem. "Exact" means that it can converge to the exact solution. EXTRA uses a fixed, large step size, which can be determined independently of the network size or topology. The local variable of every agent $i$ converges uniformly and consensually to an exact minimizer of $\bar{f}$. In contrast, the well-known decentralized gradient descent (DGD) method must use diminishing step sizes in order to converge to an exact minimizer. EXTRA and DGD have the same choice of mixing matrices and similar per-iteration complexity. EXTRA, however, uses the gradients of the last two iterates, unlike DGD, which uses just that of the last iterate. EXTRA has the best known convergence rates among the existing synchronized first-order decentralized algorithms for minimizing convex Lipschitz-differentiable functions. Specifically, if the $f_i$'s are convex and have Lipschitz continuous gradients, EXTRA has an ergodic convergence rate $O(1/k)$ in terms of the first-order optimality residual. In addition, as long as $\bar{f}$ is (restricted) strongly convex (not all individual $f_i$'s need to be so), EXTRA converges to an optimal solution at a linear rate $O(C^{-k})$ for some constant $C > 1$.
doi:10.1137/14096668x fatcat:v6xn23eh2fesxmlr7u6mus2rxe
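
The contrast the abstract draws between EXTRA and fixed-step DGD can be illustrated with a small sketch. The update rule used here, $x^{1} = W x^{0} - \alpha \nabla f(x^{0})$ followed by $x^{k+2} = (I + W) x^{k+1} - \tilde{W} x^{k} - \alpha [\nabla f(x^{k+1}) - \nabla f(x^{k})]$ with $\tilde{W} = (I + W)/2$, is taken from the body of the paper rather than from the abstract above. Everything else is an assumption made for illustration only: the scalar objectives $f_i(x) = \frac{1}{2}(x - b_i)^2$, the five-agent ring, the mixing weights, the step size 0.5, the iteration count, and all function names.

```python
# Illustrative sketch only: toy problem, ring network, and step size are assumptions.

def ring_mixing_matrix(n):
    """Symmetric doubly stochastic W for a ring: 1/2 on the diagonal, 1/4 per neighbor."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][i] = 0.5
        W[i][(i - 1) % n] += 0.25
        W[i][(i + 1) % n] += 0.25
    return W

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def grad(x, b):
    # Stacked local gradients: agent i privately holds f_i(x) = 0.5 * (x - b_i)^2.
    return [x_i - b_i for x_i, b_i in zip(x, b)]

def dgd(W, b, alpha, iters):
    """DGD with a FIXED step size: x <- W x - alpha * grad(x)."""
    x = [0.0] * len(b)
    for _ in range(iters):
        g = grad(x, b)
        Wx = matvec(W, x)
        x = [wx - alpha * gi for wx, gi in zip(Wx, g)]
    return x

def extra(W, b, alpha, iters):
    """EXTRA: same mixing matrix as DGD, but uses the gradients of the last two iterates."""
    n = len(b)
    # W_tilde = (I + W) / 2
    W_tilde = [[(W[i][j] + (1.0 if i == j else 0.0)) / 2.0 for j in range(n)]
               for i in range(n)]
    x_prev = [0.0] * n
    g_prev = grad(x_prev, b)
    # First step coincides with one DGD step.
    x = [wx - alpha * gi for wx, gi in zip(matvec(W, x_prev), g_prev)]
    for _ in range(iters - 1):
        g = grad(x, b)
        Wx = matvec(W, x)
        Wt_xprev = matvec(W_tilde, x_prev)
        # x_next = (I + W) x - W_tilde x_prev - alpha * (g - g_prev)
        x_next = [x[i] + Wx[i] - Wt_xprev[i] - alpha * (g[i] - g_prev[i])
                  for i in range(n)]
        x_prev, g_prev, x = x, g, x_next
    return x

b = [1.0, 2.0, 3.0, 4.0, 10.0]      # private data of the five agents (made up)
x_star = sum(b) / len(b)            # minimizer of the average objective
W = ring_mixing_matrix(len(b))

x_dgd = dgd(W, b, alpha=0.5, iters=200)
x_extra = extra(W, b, alpha=0.5, iters=200)

err_dgd = max(abs(v - x_star) for v in x_dgd)
err_extra = max(abs(v - x_star) for v in x_extra)
print("max agent error, fixed-step DGD:", err_dgd)
print("max agent error, EXTRA:         ", err_extra)
```

On this toy problem, every agent's EXTRA iterate should approach the common minimizer, while fixed-step DGD settles at points that disagree across agents. This matches the abstract's claim qualitatively but says nothing about the rates proved in the paper.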