Adaptive Gradient Descent without Descent [article]

Yura Malitsky, Konstantin Mishchenko
2020 arXiv   pre-print
Yura Malitsky was supported by the ONRG project N62909-17-1-2111 and HASLER project N16066.  ...  Yura Malitsky wishes to thank Roman Cheplyaka for his interest in optimization that partly inspired the current work.  ... 
arXiv:1910.09529v2 fatcat:kqbidyp6uracflsf6rz7jfpski

Revisiting Stochastic Extragradient [article]

Konstantin Mishchenko, Dmitry Kovalev, Egor Shulgin, Peter Richtárik, Yura Malitsky
2020 arXiv   pre-print
We fix a fundamental issue in the stochastic extragradient method by providing a new sampling strategy that is motivated by approximating implicit updates. Since the existing stochastic extragradient algorithm, called Mirror-Prox, of (Juditsky et al., 2011) diverges on a simple bilinear problem when the domain is not bounded, we prove guarantees for solving variational inequalities that go beyond existing settings. Furthermore, we illustrate numerically that the proposed variant converges faster than many other methods on bilinear saddle-point problems. We also discuss how extragradient can be applied to training Generative Adversarial Networks (GANs) and how it compares to other methods. Our experiments on GANs demonstrate that the introduced approach may make the training faster in terms of data passes, while its higher iteration complexity makes the advantage smaller.
arXiv:1905.11373v2 fatcat:yx5barli3vbkvktiq74735b5ve
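
Bilinear saddle-point problems on an unbounded domain, as mentioned in the abstract above, are a common stress test for min-max methods. The following is a minimal deterministic sketch, assuming the toy problem min_x max_y x*y with operator F(x, y) = (y, -x). It compares plain gradient descent-ascent with the classical two-step extragradient update. The function names, stepsize, and iteration count are illustrative assumptions, and the sketch does not implement the stochastic sampling strategy proposed in the paper.

```python
import math


def F(z):
    """Operator of the toy bilinear problem min_x max_y x*y."""
    x, y = z
    return (y, -x)


def gda_step(z, eta):
    """One simultaneous gradient descent-ascent step."""
    g = F(z)
    return (z[0] - eta * g[0], z[1] - eta * g[1])


def extragradient_step(z, eta):
    """One extragradient step: extrapolate, then update from the original point."""
    g = F(z)
    z_half = (z[0] - eta * g[0], z[1] - eta * g[1])
    g_half = F(z_half)
    return (z[0] - eta * g_half[0], z[1] - eta * g_half[1])


def run(step, z0, eta, iters):
    """Apply `step` repeatedly and return the distance to the solution (0, 0)."""
    z = z0
    for _ in range(iters):
        z = step(z, eta)
    return math.hypot(*z)


# Illustrative settings (assumptions, not taken from the paper).
gda_norm = run(gda_step, (1.0, 1.0), eta=0.5, iters=200)
eg_norm = run(extragradient_step, (1.0, 1.0), eta=0.5, iters=200)
print(f"gradient descent-ascent: {gda_norm:.3e}, extragradient: {eg_norm:.3e}")
```

For this operator each descent-ascent step multiplies the squared norm by 1 + eta^2, so the iterates spiral outward, while each extragradient step multiplies it by 1 - eta^2 + eta^4, which is below 1 for any eta in (0, 1).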

Golden Ratio Algorithms for Variational Inequalities [article]

Yura Malitsky
2019 arXiv   pre-print
The paper presents a fully explicit algorithm for monotone variational inequalities. The method uses variable stepsizes that are computed using two previous iterates as an approximation of the local Lipschitz constant without running a linesearch. Thus, each iteration of the method requires only one evaluation of a monotone operator F and a proximal mapping g. The operator F need not be Lipschitz-continuous, which also makes the algorithm interesting in the area of composite minimization where one cannot use the descent lemma. The method exhibits an ergodic O(1/k) convergence rate and R-linear rate, if F, g satisfy the error bound condition. We discuss possible applications of the method to fixed point problems as well as its different generalizations.
arXiv:1803.08832v2 fatcat:hezsfjdetzcv5lbwh6bxei432y

Shadow Douglas–Rachford Splitting for Monotone Inclusions [article]

Ernö Robert Csetnek, Yura Malitsky, Matthew K. Tam
2019 arXiv   pre-print
In this work, we propose a new algorithm for finding a zero in the sum of two monotone operators where one is assumed to be single-valued and Lipschitz continuous. This algorithm naturally arises from a non-standard discretization of a continuous dynamical system associated with the Douglas–Rachford splitting algorithm. More precisely, it is obtained by performing an explicit, rather than implicit, discretization with respect to one of the operators involved. Each iteration of the proposed algorithm requires the evaluation of one forward and one backward operator.
arXiv:1903.03393v1 fatcat:3w2r5sme3jek3pm32mtyuqkynu

A new regret analysis for Adam-type algorithms [article]

Ahmet Alacaoglu, Yura Malitsky, Panayotis Mertikopoulos, Volkan Cevher
2020 arXiv   pre-print
In this paper, we focus on a theory-practice gap for Adam and its variants (AMSgrad, AdamNC, etc.). In practice, these algorithms are used with a constant first-order moment parameter β_1 (typically between 0.9 and 0.99). In theory, regret guarantees for online convex optimization require a rapidly decaying β_1→0 schedule. We show that this is an artifact of the standard analysis and propose a novel framework that allows us to derive optimal, data-dependent regret bounds with a constant β_1, without further assumptions. We also demonstrate the flexibility of our analysis on a wide range of different algorithms and settings.
arXiv:2003.09729v1 fatcat:3quackjgqzaangwagzqz6kbdeu
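
To make the role of a constant β_1 concrete, here is a minimal sketch of an AMSGrad-style update on a one-dimensional constrained problem. It assumes a single fixed loss f(x) = 0.5*(x - 3)^2 on the interval [0, 2] rather than an online sequence of losses, a stepsize of alpha/sqrt(t), and plain clipping as the projection (in one dimension a weighted projection onto an interval reduces to clipping). All names and parameter values are illustrative assumptions, not the setup analyzed in the paper.

```python
import math


def amsgrad(grad, x0, lower, upper, alpha=0.5, beta1=0.9, beta2=0.999,
            eps=1e-8, iters=200):
    """AMSGrad-style iteration with constant beta1 and projection onto [lower, upper]."""
    x, m, v, v_hat = x0, 0.0, 0.0, 0.0
    for t in range(1, iters + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g        # first moment, constant beta1
        v = beta2 * v + (1.0 - beta2) * g * g    # second moment
        v_hat = max(v_hat, v)                    # running maximum keeps scaling monotone
        step = alpha / math.sqrt(t)              # decaying stepsize (assumption)
        x = x - step * m / (math.sqrt(v_hat) + eps)
        x = min(max(x, lower), upper)            # projection onto the interval
    return x


# Gradient of the toy loss 0.5 * (x - 3)^2; its constrained minimizer is x = 2.
x_final = amsgrad(lambda x: x - 3.0, x0=0.0, lower=0.0, upper=2.0)
print(x_final)
```

In this toy run β_1 stays fixed at 0.9 for every iteration, which is the regime the abstract describes as standard in practice.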

Convergence of adaptive algorithms for weakly convex constrained optimization [article]

Ahmet Alacaoglu, Yura Malitsky, Volkan Cevher
2020 arXiv   pre-print
We analyze the adaptive first order algorithm AMSGrad for solving a constrained stochastic optimization problem with a weakly convex objective. We prove the Õ(t^-1/4) rate of convergence for the norm of the gradient of Moreau envelope, which is the standard stationarity measure for this class of problems. It matches the known rates that adaptive algorithms enjoy for the specific case of unconstrained smooth stochastic optimization. Our analysis works with mini-batch size of 1, constant first and second order moment parameters, and possibly unbounded optimization domains. Finally, we illustrate the applications and extensions of our results to specific problems and algorithms.
arXiv:2006.06650v1 fatcat:xvrkpm52f5gkbgglmpqtvbcpiy

Model Function Based Conditional Gradient Method with Armijo-like Line Search [article]

Yura Malitsky, Peter Ochs
2019 arXiv   pre-print
The Conditional Gradient Method is generalized to a class of non-smooth non-convex optimization problems with many applications in machine learning. The proposed algorithm iterates by minimizing so-called model functions over the constraint set. Complemented with an Armijo line search procedure, we prove that subsequences converge to a stationary point. The abstract framework of model functions provides great flexibility for the design of concrete algorithms. As special cases, for example, we develop an algorithm for additive composite problems and an algorithm for non-linear composite problems which leads to a Gauss–Newton-type algorithm. Both instances are novel in non-smooth non-convex optimization and come with numerous applications in machine learning. Moreover, we obtain a hybrid version of Conditional Gradient and Proximal Minimization schemes for free, which combines advantages of both. Our algorithm is shown to perform favorably on a sparse non-linear robust regression problem and we discuss the flexibility of the proposed framework in several matrix factorization formulations.
arXiv:1901.08087v1 fatcat:cc3z5hijyfcxzmdjgkeampiw6a

Stochastic Variance Reduction for Variational Inequality Methods [article]

Ahmet Alacaoglu, Yura Malitsky
2022 arXiv   pre-print
We propose stochastic variance reduced algorithms for solving convex-concave saddle point problems, monotone variational inequalities, and monotone inclusions. Our framework applies to extragradient, forward-backward-forward, and forward-reflected-backward methods both in Euclidean and Bregman setups. All proposed methods converge in the same setting as their deterministic counterparts and they either match or improve the best-known complexities for solving structured min-max problems. Our results reinforce the correspondence between variance reduction in variational inequalities and minimization. We also illustrate the improvements of our approach with numerical evaluations on matrix games.
arXiv:2102.08352v2 fatcat:mzekjor7ovahzatb2ft7xdvxhq

Block-coordinate primal-dual method for the nonsmooth minimization over linear constraints [article]

D. Russell Luke, Yura Malitsky
2018 arXiv   pre-print
Russell Luke Institute for Numerical and Applied Mathematics, University of Göttingen, 37083 Göttingen, Germany, e-mail: Yura Malitsky Institute for Numerical and Applied  ... 
arXiv:1801.04782v1 fatcat:jvokkvxauvb3fkbnawgjownfky

A first-order primal-dual method with adaptivity to local smoothness [article]

Maria-Luiza Vladarean, Yura Malitsky, Volkan Cevher
2021 arXiv   pre-print
(c) of [Malitsky and Mishchenko, 2020] .  ...  Malitsky [2020] proposes an algorithm for solving monotone VIs with a stepsize that adapts to local smoothness similarly to (3).  ... 
arXiv:2110.15148v1 fatcat:knhloajgwzgxxneyu65bcyt7uy

Forward-reflected-backward method with variance reduction

Ahmet Alacaoglu, Yura Malitsky, Volkan Cevher
2021 Computational optimization and applications  
We propose a variance reduced algorithm for solving monotone variational inequalities. Without assuming strong monotonicity, cocoercivity, or boundedness of the domain, we prove almost sure convergence of the iterates generated by the algorithm to a solution. In the monotone case, the ergodic average converges with the optimal O(1/k) rate of convergence. When strong monotonicity is assumed, the algorithm converges linearly, without requiring the knowledge of strong monotonicity ... We finalize with extensions and applications of our results to monotone inclusions, a class of non-monotone variational inequalities and Bregman projections.
doi:10.1007/s10589-021-00305-3 pmid:34720428 pmcid:PMC8550342 fatcat:dyctzkychrf2ndcc7axbrxml2q
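
As background for the entry above, the following is a minimal sketch of the deterministic forward-reflected-backward iteration, without any variance reduction. It assumes the update z_{k+1} = P_C(z_k - 2*lam*F(z_k) + lam*F(z_{k-1})) with a stepsize below 1/(2L), applied to the toy operator F(x, y) = (y, -x) (Lipschitz constant L = 1) over the box C = [-1, 1]^2. The problem, the box, and all names are illustrative assumptions and are not taken from the paper.

```python
import math


def F(z):
    """Monotone (skew-symmetric) toy operator with Lipschitz constant 1."""
    x, y = z
    return (y, -x)


def project_box(z, radius=1.0):
    """Backward step: projection onto the box [-radius, radius]^2."""
    return tuple(min(max(c, -radius), radius) for c in z)


def forward_reflected_backward(z0, lam=0.25, iters=2000):
    """Deterministic iteration using one new evaluation of F per step."""
    z = z0
    f_prev = F(z0)  # convention for the first step: previous iterate equals z0
    for _ in range(iters):
        f = F(z)
        z_next = project_box(tuple(
            zi - 2.0 * lam * fi + lam * fpi
            for zi, fi, fpi in zip(z, f, f_prev)
        ))
        z, f_prev = z_next, f
    return z


z_final = forward_reflected_backward((1.0, 1.0))
residual = math.hypot(*z_final)  # distance to the unique solution (0, 0)
print(residual)
```

Reusing the stored value F(z_{k-1}) is what keeps the cost at one operator evaluation and one projection per iteration in this sketch.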

Distributed Forward-Backward Methods for Ring Networks [article]

Francisco J. Aragón-Artacho, Yura Malitsky, Matthew K. Tam, David Torregrosa-Belén
2022 arXiv   pre-print
For the case when B_1 = ⋯ = B_{n−1} = 0, Malitsky & Tam Given λ ∈ (0, 2/L) and γ ∈ (0, 1 − λL/2) and an initial point z^0 = (z^0_1, …, z^0_{n−1}) ∈ H^{n−1}, our proposed algorithm for (7) generates  ... 
arXiv:2112.00274v2 fatcat:dswvtqzqgffj3ehl7exgrssfgi

Efficient, Quantitative Numerical Methods for Statistical Image Deconvolution and Denoising [chapter]

D. Russell Luke, C. Charitha, Ron Shefi, Yura Malitsky
2020 Topics in applied physics  
We review the development of efficient numerical methods for statistical multi-resolution estimation of optical imaging experiments. In principle, this involves constrained linear deconvolution and denoising, and so these types of problems can be formulated as convex constrained, or even unconstrained, optimization. We address two main challenges: first of these is to quantify convergence of iterative algorithms; the second challenge is to develop efficient methods for these large-scale ... without sacrificing the quantification of convergence. We review the state of the art for these challenges. 2010 Mathematics Subject Classification: Primary 49J52 · 49M20 · 90C26 · Secondary 15A29 · 47H09 · 65K05 · 65K10 · 94A08.
doi:10.1007/978-3-030-34413-9_12 fatcat:e35fzehsyrdnld254ujjj33g2e

Resolvent Splitting for Sums of Monotone Operators with Minimal Lifting [article]

Yura Malitsky, Matthew K. Tam
2022 arXiv   pre-print
In this work, we study fixed point algorithms for finding a zero in the sum of n ≥ 2 maximally monotone operators by using their resolvents. More precisely, we consider the class of such algorithms where each resolvent is evaluated only once per iteration. For any algorithm from this class, we show that the underlying fixed point operator is necessarily defined on a d-fold Cartesian product space with d ≥ n-1. Further, we show that this bound is unimprovable by providing a family of examples for which d=n-1 is attained. This family includes the Douglas-Rachford algorithm as the special case when n=2. Applications of the new family of algorithms in distributed decentralised optimisation and multi-block extensions of the alternating direction method of multipliers (ADMM) are discussed.
arXiv:2108.02897v2 fatcat:r43e65ic5bg77bulshlvbq67vm
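
The n=2 special case mentioned in the abstract above can be sketched as a classical Douglas-Rachford iteration, in which each resolvent is evaluated once per iteration and the fixed point variable z lives in a single copy of the space (d = n-1 = 1). The example assumes the scalar operators A = ∂|·| and B = ∇(0.5*(· - 3)^2), whose sum has its zero at x = 2. The choice of operators, the parameter gamma, and the function names are illustrative assumptions, not taken from the paper.

```python
def soft_threshold(v, tau):
    """Resolvent of tau * subdifferential of |x| (soft-thresholding)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def resolvent_quadratic(v, center, gamma):
    """Resolvent of gamma * gradient of 0.5 * (x - center)^2."""
    return (v + gamma * center) / (1.0 + gamma)


def douglas_rachford(z0, center=3.0, gamma=1.0, iters=100):
    """Douglas-Rachford iteration on the governing variable z."""
    z = z0
    for _ in range(iters):
        x = soft_threshold(z, gamma)                         # resolvent of A, once
        y = resolvent_quadratic(2.0 * x - z, center, gamma)  # resolvent of B, once
        z = z + y - x                                        # fixed point update
    return soft_threshold(z, gamma)  # "shadow" point approximating the zero of A + B


x_star = douglas_rachford(0.0)
print(x_star)
```

With gamma = 1 the governing variable satisfies z_{k+1} = (z_k + 3)/2 once z_k exceeds 1, so it converges to z = 3 and the returned point converges to 2.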

A First-Order Primal-Dual Algorithm with Linesearch

Yura Malitsky, Thomas Pock
2018 SIAM Journal on Optimization  
Figure 3: Convergence plots for problem (38). Codes can be found on  ... 
doi:10.1137/16m1092015 fatcat:vo5un7cy6facjc3u5wveka5eve