
Decentralized Composite Optimization with Compression [article]

Yao Li, Xiaorui Liu, Jiliang Tang, Ming Yan, Kun Yuan
2021 arXiv   pre-print
A Proximal gradient LinEAr convergent Decentralized algorithm with compression, Prox-LEAD, is proposed with rigorous theoretical analyses in the general stochastic setting and the finite-sum setting.  ...  While existing decentralized algorithms with communication compression mostly focus on the problems with only smooth components, we study the decentralized stochastic composite optimization problem with  ...  non-accelerated stochastic decentralized algorithms.  ... 
arXiv:2108.04448v2 fatcat:jckvtswsonht3frf7mlwipqk2u
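
The composite problems targeted by Prox-LEAD pair a smooth loss with a nonsmooth regularizer, and the core local operation is a proximal gradient step. Below is a minimal single-node sketch of that step, assuming an L1 regularizer whose proximal operator is soft-thresholding; the names are ours and the compression/communication parts of the actual algorithm are not shown, so this is an illustration rather than the paper's method.

    import numpy as np

    def soft_threshold(v, t):
        # Proximal operator of t * ||.||_1 (soft-thresholding).
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def prox_gradient_step(x, grad_smooth, step, lam):
        # One proximal gradient step on F(x) = f(x) + lam * ||x||_1:
        # descend on the smooth part f, then apply the prox of the
        # nonsmooth part.
        return soft_threshold(x - step * grad_smooth(x), step * lam)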

Asynchronous Accelerated Proximal Stochastic Gradient for Strongly Convex Distributed Finite Sums [article]

Hadrien Hendrikx, Francis Bach, Laurent Massoulié
2019 arXiv   pre-print
We propose the decentralized and asynchronous algorithm ADFS to tackle the case when local functions are themselves finite sums with m components.  ...  This also leads to a √(m) speed-up over state-of-the-art distributed batch methods, which is the expected speed-up for finite sum algorithms.  ...  Stochastic algorithms for finite sums. So far, we have only presented batch methods that rely on computing full gradient steps of each function f i .  ... 
arXiv:1901.09865v3 fatcat:4y7q7dzlkjfk3h3qqp53ah3i64
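
The setting described in the snippet, where each of n nodes holds a local function that is itself a finite sum of m components, can be written as follows (notation ours, an illustration rather than the paper's exact formulation):

    \min_{x \in \mathbb{R}^d} \sum_{i=1}^{n} f_i(x), \qquad f_i(x) = \sum_{j=1}^{m} f_{i,j}(x).

A batch method pays for all n·m component gradients per step, whereas a stochastic method touches a single f_{i,j} at a time, which is what makes the claimed √(m) speed-up over distributed batch methods possible.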

An Optimal Algorithm for Decentralized Finite Sum Optimization [article]

Hadrien Hendrikx, Francis Bach, Laurent Massoulié
2020 arXiv   pre-print
In this work, we propose an efficient Accelerated Decentralized stochastic algorithm for Finite Sums named ADFS, which uses local stochastic proximal updates and decentralized communications between nodes  ...  For smooth and strongly convex problems, existing decentralized algorithms are slower than modern accelerated variance-reduced stochastic algorithms when run on a single machine, and are therefore not  ...  The main contribution of this paper is a locally synchronous Accelerated Decentralized stochastic algorithm for Finite Sums, named ADFS.  ... 
arXiv:2005.10675v1 fatcat:vilzwpwcsvbfvi3ktxep2yjvka

A general framework for decentralized optimization with first-order methods [article]

Ran Xin, Shi Pu, Angelia Nedić, Usman A. Khan
2020 arXiv   pre-print
Decentralized optimization to minimize a finite sum of functions over a network of nodes has been a significant focus within control and signal processing research due to its natural relevance to optimal  ...  We further extend the discussion to decentralized stochastic first-order methods that rely on stochastic gradients at each node and describe how local variance reduction schemes, previously shown to have  ...  Decentralized optimization is finite sum minimization formulated over a network of nodes.  ... 
arXiv:2009.05837v1 fatcat:4bqqtaskjvgkvitohhnyn65cym

Distributed stochastic proximal algorithm with random reshuffling for non-smooth finite-sum optimization [article]

Xia Jiang, Xianlin Zeng, Jian Sun, Jie Chen, Lihua Xie
2021 arXiv   pre-print
This paper develops a distributed stochastic proximal-gradient algorithm with random reshuffling to solve finite-sum minimization over time-varying multi-agent networks.  ...  Non-smooth finite-sum minimization is a fundamental problem in machine learning.  ...  For smooth finite-sum minimization, [36] has studied decentralized stochastic gradient methods with shuffling and provided insights into practical data-processing procedures.  ... 
arXiv:2111.03820v1 fatcat:je65n2crwnbsbnakucjgyinfsq
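
Random reshuffling, as opposed to i.i.d. sampling with replacement, visits every component exactly once per epoch in a freshly permuted order. A minimal single-node sketch (plain gradient steps only; the distributed, proximal, and time-varying-network aspects of the paper are omitted, and the names are ours):

    import numpy as np

    def reshuffled_epoch(x, component_grads, step, rng):
        # One random-reshuffling epoch: a fresh permutation of the
        # component indices, each visited exactly once (no replacement).
        for i in rng.permutation(len(component_grads)):
            x = x - step * component_grads[i](x)
        return x

    # Usage sketch: rng = np.random.default_rng(0); pass a list of gradient callables.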

Decentralized and Parallel Primal and Dual Accelerated Methods for Stochastic Convex Programming Problems [article]

Darina Dvinskikh, Alexander Gasnikov
2021 arXiv   pre-print
We introduce primal and dual stochastic gradient oracle methods for decentralized convex optimization problems.  ...  The considered algorithms can be applied to many data science problems and inverse problems.  ...  In Section 6, we incorporate the proposed distributed decentralized method to get the optimal bounds for the finite-sum minimization problem using primal or dual oracle.  ... 
arXiv:1904.09015v17 fatcat:7j5ueplfsbcshfv75kd7nxndne

Dual-Free Stochastic Decentralized Optimization with Variance Reduction [article]

Hadrien Hendrikx, Francis Bach, Laurent Massoulié
2020 arXiv   pre-print
For finite-sum problems, fast single-machine algorithms for large datasets rely on stochastic updates combined with variance reduction.  ...  We give an accelerated version of DVR based on the Catalyst framework, and illustrate its effectiveness with simulations on real data.  ...  An accelerated decentralized stochastic proximal algorithm for finite sums. In Advances in Neural Information Processing Systems, 2019b. Hadrien Hendrikx, Francis Bach, and Laurent Massoulié.  ... 
arXiv:2006.14384v1 fatcat:kia2cagntrcilovwnfl2gsfnyu
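
The "stochastic updates combined with variance reduction" mentioned in the snippet usually means an SVRG-style gradient estimator: correct a single sampled component gradient with a periodically refreshed full gradient taken at a snapshot point. A minimal sketch of that estimator, under our own naming (not DVR's actual dual-free construction):

    def svrg_estimator(i, x, x_snap, component_grads, full_grad_snap):
        # Unbiased estimator of the full gradient at x; its variance
        # shrinks as both x and the snapshot x_snap approach the optimum.
        return component_grads[i](x) - component_grads[i](x_snap) + full_grad_snap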

A Stochastic Proximal Gradient Framework for Decentralized Non-Convex Composite Optimization: Topology-Independent Sample Complexity and Communication Efficiency [article]

Ran Xin, Subhro Das, Usman A. Khan, Soummya Kar
2021 arXiv   pre-print
Under this general formulation, we propose the first provably efficient, stochastic proximal gradient framework, called ProxGT.  ...  Decentralized optimization is a promising parallel computation paradigm for large-scale data analytics and machine learning problems defined over a network of nodes.  ...  finite-sum variance reduction methods [19, 37, 54] implemented on a single node.  ... 
arXiv:2110.01594v1 fatcat:gvws5pi4ibhglgeobim2scr46y

Decentralized Stochastic Proximal Gradient Descent with Variance Reduction over Time-varying Networks [article]

Xuanjie Li, Yuedong Xu, Jessie Hui Wang, Xin Wang, John C.S. Lui
2022 arXiv   pre-print
In decentralized learning, a network of nodes cooperates to minimize an overall objective function that is usually the finite sum of their local objectives and incorporates a non-smooth regularization  ...  In this paper, we propose a novel algorithm, namely DPSVRG, to accelerate decentralized training by leveraging the variance reduction technique.  ...  Decentralized algorithms to solve finite-sum minimization problems are crucial to train machine learning models where data samples are distributed across a network of nodes.  ... 
arXiv:2112.10389v2 fatcat:un5kzj5e4rc2fe3wjyxfvfgncy
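
The problem sketched in the snippet, a finite sum of local objectives plus a shared non-smooth regularizer, can be stated as (our notation, for illustration):

    \min_{x \in \mathbb{R}^d} \frac{1}{n} \sum_{i=1}^{n} f_i(x) + h(x),

where f_i is the smooth local loss held at node i and h is the non-smooth regularizer handled through its proximal operator.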

The Power of First-Order Smooth Optimization for Black-Box Non-Smooth Problems [article]

Alexander Gasnikov, Anton Novitskii, Vasilii Novitskii, Farshed Abdukhakimov, Dmitry Kamzolov, Aleksandr Beznosikov, Martin Takáč, Pavel Dvurechensky, Bin Gu
2022 arXiv   pre-print
We also elaborate on extensions for stochastic optimization problems, saddle-point problems, and distributed optimization.  ...  the convergence of our algorithms.  ...  Finite-sum problems: As a special case of (10) with ξ uniformly distributed on {1, ..., m}, we can consider the finite-sum (Empirical Risk Minimization) problem $\min_{x \in Q \subseteq \mathbb{R}^d} f(x) := \mathbb{E}_{\xi} f(x, \xi) = \frac{1}{m} \sum_{i=1}^{m}$  ... 
arXiv:2201.12289v2 fatcat:reieaaymyjfaloqhac67jxsrhe

An introduction to decentralized stochastic optimization with gradient tracking [article]

Ran Xin and Soummya Kar and Usman A. Khan
2019 arXiv   pre-print
Decentralized solutions to finite-sum minimization are of significant importance in many signal processing, control, and machine learning applications.  ...  We provide intuitive illustrations of the main technical ideas as well as applications of the algorithms in the context of decentralized training of machine learning models.  ...  mapping [30] to each iteration of DSA, and ADFS [31] that applies an accelerated randomized proximal coordinate gradient method [32] to the dual formulation of Problem P3.  ... 
arXiv:1907.09648v2 fatcat:rizav6qrm5fh3ip63b6lg3tbse
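
For reference, a common form of the gradient-tracking recursion surveyed in this line of work (one of several equivalent variants; W = [w_ij] is a doubly stochastic mixing matrix and α a step size):

    x_i^{k+1} = \sum_{j} w_{ij}\, x_j^{k} - \alpha\, y_i^{k}, \qquad
    y_i^{k+1} = \sum_{j} w_{ij}\, y_j^{k} + \nabla f_i(x_i^{k+1}) - \nabla f_i(x_i^{k}),

so that each auxiliary variable y_i^k tracks the network-average gradient.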

Random gradient extrapolation for distributed and stochastic optimization [article]

Guanghui Lan, Yi Zhou
2017 arXiv   pre-print
Furthermore, we demonstrate that for stochastic finite-sum optimization problems, RGEM maintains the optimal ${\cal O}(1/\epsilon)$ complexity (up to a certain logarithmic factor) in terms of the number  ...  Moreover, our algorithms were developed based on a novel dual perspective of Nesterov's accelerated gradient method.  ...  RGEM for stochastic finite-sum optimization.  ... 
arXiv:1711.05762v1 fatcat:wrfsj4hasrctjifpqlduexbyhy

Gradient tracking and variance reduction for decentralized optimization and machine learning [article]

Ran Xin, Soummya Kar, Usman A. Khan
2020 arXiv   pre-print
Decentralized methods to solve finite-sum minimization problems are important in many signal processing and machine learning tasks where the data is distributed over a network of nodes and raw data sharing  ...  In this article, we review decentralized stochastic first-order methods and provide a unified algorithmic framework that combines variance-reduction with gradient tracking to achieve both robust performance  ...  and AVRG [31], DSBA [32] that adds proximal mapping [33] to each iteration of DSA, ADFS [34] that applies an accelerated randomized proximal coordinate gradient method [35] to the dual formulation  ... 
arXiv:2002.05373v1 fatcat:vbpzknt3gfdxpahx72kbzmi5wm

Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters [article]

Pavel Dvurechensky, Darina Dvinskikh, Alexander Gasnikov, César A. Uribe, Angelia Nedić
2020 arXiv   pre-print
Motivated by this problem, we develop and analyze a novel accelerated primal-dual stochastic gradient method for general stochastic convex optimization problems with linear equality constraints.  ...  We study the decentralized distributed computation of discrete approximations for the regularized Wasserstein barycenter of a finite set of continuous probability measures distributedly stored over a network  ...  Accelerated primal-dual stochastic gradient method: In this subsection, we develop an accelerated algorithm for the primal-dual pair of problems (P)–(D).  ... 
arXiv:1806.03915v3 fatcat:7mqdtl4vavefljzlmzmrcbk3ii
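
The primal-dual pair for a convex problem with linear equality constraints, written generically (the paper's (P)–(D) may differ in details such as regularization):

    (P)\ \min_{x}\ f(x)\ \ \text{s.t.}\ \ Ax = b, \qquad
    (D)\ \max_{\lambda}\ \big[ -f^{*}(-A^{\top}\lambda) - \langle \lambda, b \rangle \big],

where f^{*} denotes the Fenchel conjugate of f.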

ANITA: An Optimal Loopless Accelerated Variance-Reduced Gradient Method [article]

Zhize Li
2021 arXiv   pre-print
In this paper, we propose a novel accelerated gradient method called ANITA for solving fundamental finite-sum optimization problems.  ...  Moreover, ANITA enjoys a simpler loopless algorithmic structure, unlike previous accelerated algorithms such as Varag (Lan et al., 2019) and Katyusha (Allen-Zhu, 2017), which use an inconvenient double-loop  ...  Proximal stochastic methods for nonsmooth nonconvex finite-sum optimization. In Advances in Neural Information Processing Systems, pages 1145-1153, 2016b.  ... 
arXiv:2103.11333v2 fatcat:za22znychvgnpnfk64ajfutilm
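
The "loopless" structure referred to in the snippet replaces the outer epoch loop of double-loop methods with a randomized snapshot refresh, as in L-SVRG-style schemes. A minimal sketch of that idea (an analogous loopless update written by us, not ANITA's actual accelerated step):

    import numpy as np

    def loopless_vr_step(x, x_snap, full_grad_snap, component_grads, step, p, rng):
        # Loopless variance reduction: no outer epoch loop; the snapshot
        # pair (x_snap, full_grad_snap) is refreshed with small probability p.
        i = rng.integers(len(component_grads))
        g = component_grads[i](x) - component_grads[i](x_snap) + full_grad_snap
        x = x - step * g
        if rng.random() < p:
            x_snap = x
            full_grad_snap = np.mean([cg(x) for cg in component_grads], axis=0)
        return x, x_snap, full_grad_snap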
Showing results 1 — 15 out of 857 results