Adaptive Gradient Descent without Descent
[article]
2020
arXiv
pre-print
Yura Malitsky was supported by the ONRG project N62909-17-1-2111 and HASLER project N16066. ...
Yura Malitsky wishes to thank Roman Cheplyaka for his interest in optimization that partly inspired the current work. ...
arXiv:1910.09529v2
fatcat:kqbidyp6uracflsf6rz7jfpski
Revisiting Stochastic Extragradient
[article]
2020
arXiv
pre-print
We fix a fundamental issue in the stochastic extragradient method by providing a new sampling strategy that is motivated by approximating implicit updates. Since the existing stochastic extragradient algorithm, called Mirror-Prox, of (Juditsky et al., 2011) diverges on a simple bilinear problem when the domain is not bounded, we prove guarantees for solving variational inequalities that go beyond existing settings. Furthermore, we illustrate numerically that the proposed variant converges faster than many other methods on bilinear saddle-point problems. We also discuss how extragradient can be applied to training Generative Adversarial Networks (GANs) and how it compares to other methods. Our experiments on GANs demonstrate that the introduced approach may make the training faster in terms of data passes, while its higher iteration complexity makes the advantage smaller.
arXiv:1905.11373v2
fatcat:yx5barli3vbkvktiq74735b5ve
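The sampling fix described in the abstract amounts to reusing one stochastic sample for both the extrapolation and the update step of extragradient, mimicking an implicit update. Below is a minimal sketch of such a same-sample stochastic extragradient step on a toy finite-sum bilinear problem; the problem, the operator samples F_i, and the stepsize gamma are illustrative assumptions for this sketch, not the paper's setup.

```python
import numpy as np

# Toy finite-sum bilinear saddle-point problem: min_x max_y (1/n) sum_i x^T A_i y,
# with monotone operator F(z) = (1/n) sum_i F_i(z), F_i(x, y) = (A_i y, -A_i^T x).
rng = np.random.default_rng(0)
n, d = 20, 5
As = rng.standard_normal((n, d, d))

def F_i(z, i):
    x, y = z[:d], z[d:]
    return np.concatenate([As[i] @ y, -As[i].T @ x])

gamma = 0.05                            # illustrative stepsize
z = rng.standard_normal(2 * d)
for k in range(10000):
    i = rng.integers(n)                 # draw ONE sample per iteration
    z_half = z - gamma * F_i(z, i)      # extrapolation step with sample i
    z = z - gamma * F_i(z_half, i)      # update step reuses the SAME sample i
print(np.linalg.norm(z))                # distance to the toy problem's solution z* = 0
```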
Golden Ratio Algorithms for Variational Inequalities
[article]
2019
arXiv
pre-print
The paper presents a fully explicit algorithm for monotone variational inequalities. The method uses variable stepsizes that are computed using two previous iterates as an approximation of the local Lipschitz constant without running a linesearch. Thus, each iteration of the method requires only one evaluation of a monotone operator F and a proximal mapping g. The operator F need not be Lipschitz-continuous, which also makes the algorithm interesting in the area of composite minimization where one cannot use the descent lemma. The method exhibits an ergodic O(1/k) convergence rate and an R-linear rate if F and g satisfy the error bound condition. We discuss possible applications of the method to fixed point problems as well as its different generalizations.
arXiv:1803.08832v2
fatcat:hezsfjdetzcv5lbwh6bxei432y
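For concreteness, here is a minimal sketch of a golden-ratio-type iteration of the kind described above, written with a constant stepsize lam for simplicity; in the paper the stepsize is instead computed adaptively from the two previous iterates. The choice of g (a box indicator), the affine operator F, and all constants below are illustrative assumptions, and lam is set from a global Lipschitz constant purely for the demo.

```python
import numpy as np

phi = (1 + 5 ** 0.5) / 2                        # golden ratio

def prox_g(z):
    # g = indicator of the box [-1, 1]^d, so the proximal mapping is a projection
    return np.clip(z, -1.0, 1.0)

def graal(F, z0, lam, iters=2000):
    z, z_bar = z0.copy(), z0.copy()
    for _ in range(iters):
        z_bar = ((phi - 1) * z + z_bar) / phi   # averaging step
        z = prox_g(z_bar - lam * F(z))          # one F evaluation and one prox per iteration
    return z

# Example: affine monotone operator F(z) = M z + b (positive semidefinite symmetric part).
rng = np.random.default_rng(0)
d = 5
S = rng.standard_normal((d, d))
M = S - S.T + 0.1 * np.eye(d)
b = rng.standard_normal(d)
L = np.linalg.norm(M, 2)                        # global Lipschitz constant of F
z_star = graal(lambda z: M @ z + b, np.zeros(d), lam=phi / (2 * L))
```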
Shadow Douglas--Rachford Splitting for Monotone Inclusions
[article]
2019
arXiv
pre-print
In this work, we propose a new algorithm for finding a zero in the sum of two monotone operators where one is assumed to be single-valued and Lipschitz continuous. This algorithm naturally arises from a non-standard discretization of a continuous dynamical system associated with the Douglas--Rachford splitting algorithm. More precisely, it is obtained by performing an explicit, rather than implicit, discretization with respect to one of the operators involved. Each iteration of the proposed algorithm requires the evaluation of one forward and one backward operator.
arXiv:1903.03393v1
fatcat:3w2r5sme3jek3pm32mtyuqkynu
A new regret analysis for Adam-type algorithms
[article]
2020
arXiv
pre-print
In this paper, we focus on a theory-practice gap for Adam and its variants (AMSgrad, AdamNC, etc.). In practice, these algorithms are used with a constant first-order moment parameter β_1 (typically between 0.9 and 0.99). In theory, regret guarantees for online convex optimization require a rapidly decaying β_1 → 0 schedule. We show that this is an artifact of the standard analysis and propose a novel framework that allows us to derive optimal, data-dependent regret bounds with a constant β_1, without further assumptions. We also demonstrate the flexibility of our analysis on a wide range of different algorithms and settings.
arXiv:2003.09729v1
fatcat:3quackjgqzaangwagzqz6kbdeu
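For reference, the practical regime discussed in the abstract (constant β_1, e.g. 0.9) corresponds to an AMSGrad-style update of the following form; the hyperparameter values and the unconstrained update are illustrative assumptions, not the paper's exact setting.

```python
import numpy as np

def amsgrad_step(x, grad, m, v, v_hat, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad-style step with a constant first-moment parameter beta1."""
    m = beta1 * m + (1 - beta1) * grad          # exponential moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2     # exponential moving average of squared gradients
    v_hat = np.maximum(v_hat, v)                # AMSGrad correction: keep the running maximum
    x = x - lr * m / (np.sqrt(v_hat) + eps)     # update; project onto the feasible set if constrained
    return x, m, v, v_hat
```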
Convergence of adaptive algorithms for weakly convex constrained optimization
[article]
2020
arXiv
pre-print
We analyze the adaptive first order algorithm AMSGrad, for solving a constrained stochastic optimization problem with a weakly convex objective. We prove the Õ(t^{-1/4}) rate of convergence for the norm of the gradient of the Moreau envelope, which is the standard stationarity measure for this class of problems. It matches the known rates that adaptive algorithms enjoy for the specific case of unconstrained smooth stochastic optimization. Our analysis works with a mini-batch size of 1, constant first and second order moment parameters, and possibly unbounded optimization domains. Finally, we illustrate the applications and extensions of our results to specific problems and algorithms.
arXiv:2006.06650v1
fatcat:xvrkpm52f5gkbgglmpqtvbcpiy
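For context, the stationarity measure mentioned in the abstract is the gradient norm of the Moreau envelope. Writing φ for the objective (the symbols φ, ρ and λ are notation introduced here), with φ assumed ρ-weakly convex and λ ∈ (0, 1/ρ),

\[
\varphi_{\lambda}(x) = \min_{y}\Big\{\varphi(y) + \tfrac{1}{2\lambda}\|y - x\|^{2}\Big\},
\qquad
\nabla\varphi_{\lambda}(x) = \tfrac{1}{\lambda}\,(x - \hat{x}),
\quad
\hat{x} = \operatorname*{arg\,min}_{y}\Big\{\varphi(y) + \tfrac{1}{2\lambda}\|y - x\|^{2}\Big\},
\]

so a small value of ||∇φ_λ(x)|| certifies that x is close to a nearly stationary point x̂.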
Model Function Based Conditional Gradient Method with Armijo-like Line Search
[article]
2019
arXiv
pre-print
The Conditional Gradient Method is generalized to a class of non-smooth non-convex optimization problems with many applications in machine learning. The proposed algorithm iterates by minimizing so-called model functions over the constraint set. Complemented with an Armijo line search procedure, we prove that subsequences converge to a stationary point. The abstract framework of model functions provides great flexibility for the design of concrete algorithms. As special cases, for example, we develop an algorithm for additive composite problems and an algorithm for non-linear composite problems which leads to a Gauss--Newton-type algorithm. Both instances are novel in non-smooth non-convex optimization and come with numerous applications in machine learning. Moreover, we obtain a hybrid version of Conditional Gradient and Proximal Minimization schemes for free, which combines advantages of both. Our algorithm is shown to perform favorably on a sparse non-linear robust regression problem and we discuss the flexibility of the proposed framework in several matrix factorization formulations.
arXiv:1901.08087v1
fatcat:cc3z5hijyfcxzmdjgkeampiw6a
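As a point of reference, the classical smooth conditional gradient (Frank-Wolfe) step with an Armijo backtracking line search is sketched below; the paper's contribution is to replace the linear model by general model functions for non-smooth non-convex problems. The simplex constraint, the quadratic objective, and the constants are illustrative assumptions.

```python
import numpy as np

def lmo_simplex(g):
    """Linear minimization oracle over the probability simplex: argmin_s <g, s>."""
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

def conditional_gradient_armijo(f, grad_f, x, iters=200, delta=1e-4, rho=0.5):
    for _ in range(iters):
        g = grad_f(x)
        d = lmo_simplex(g) - x                 # Frank-Wolfe direction
        if g @ d > -1e-12:                     # (approximately) stationary
            break
        t = 1.0
        while f(x + t * d) > f(x) + delta * t * (g @ d):   # Armijo backtracking
            t *= rho
        x = x + t * d
    return x

# Example: convex quadratic over the simplex
rng = np.random.default_rng(0)
Q = rng.standard_normal((10, 10)); Q = Q.T @ Q
f = lambda x: 0.5 * x @ Q @ x
x_star = conditional_gradient_armijo(f, lambda x: Q @ x, np.full(10, 0.1))
```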
Stochastic Variance Reduction for Variational Inequality Methods
[article]
2022
arXiv
pre-print
We propose stochastic variance reduced algorithms for solving convex-concave saddle point problems, monotone variational inequalities, and monotone inclusions. Our framework applies to extragradient, forward-backward-forward, and forward-reflected-backward methods both in Euclidean and Bregman setups. All proposed methods converge in the same setting as their deterministic counterparts and they either match or improve the best-known complexities for solving structured min-max problems. Our results reinforce the correspondence between variance reduction in variational inequalities and minimization. We also illustrate the improvements of our approach with numerical evaluations on matrix games.
arXiv:2102.08352v2
fatcat:mzekjor7ovahzatb2ft7xdvxhq
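To illustrate the variance-reduction idea only (this is not the paper's exact scheme), here is an SVRG-style estimator of a finite-sum monotone operator plugged into an extragradient step; the toy bilinear problem, the snapshot probability p, and the stepsize gamma are assumptions made for this sketch.

```python
import numpy as np

# Finite-sum monotone operator F(z) = (1/n) sum_i F_i(z) from a toy bilinear problem.
rng = np.random.default_rng(0)
n, d = 20, 5
As = rng.standard_normal((n, d, d))

def F_i(z, i):
    x, y = z[:d], z[d:]
    return np.concatenate([As[i] @ y, -As[i].T @ x])

def F_full(z):
    return np.mean([F_i(z, i) for i in range(n)], axis=0)

gamma, p = 0.05, 0.1                 # stepsize and snapshot-refresh probability
z = rng.standard_normal(2 * d)
w, Fw = z.copy(), F_full(z)          # snapshot point and its full operator value
for k in range(5000):
    i = rng.integers(n)
    vr = lambda u: F_i(u, i) - F_i(w, i) + Fw   # unbiased, variance-reduced estimate of F(u)
    z_half = z - gamma * vr(z)                  # extragradient: extrapolation step
    z = z - gamma * vr(z_half)                  # extragradient: update step
    if rng.random() < p:                        # "loopless" snapshot refresh
        w, Fw = z.copy(), F_full(z)
```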
Block-coordinate primal-dual method for the nonsmooth minimization over linear constraints
[article]
2018
arXiv
pre-print
Russell Luke Institute for Numerical and Applied Mathematics, University of Göttingen, 37083 Göttingen, Germany, e-mail: r.luke@math.uni-goettingen.de
Yura Malitsky Institute for Numerical and Applied ...
arXiv:1801.04782v1
fatcat:jvokkvxauvb3fkbnawgjownfky
A first-order primal-dual method with adaptivity to local smoothness
[article]
2021
arXiv
pre-print
(c) of [Malitsky and Mishchenko, 2020]. ...
Malitsky [2020] proposes an algorithm for solving monotone VIs with a stepsize that adapts to local smoothness similarly to (3). ...
arXiv:2110.15148v1
fatcat:knhloajgwzgxxneyu65bcyt7uy
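For reference, the stepsize rule of [Malitsky and Mishchenko, 2020] that these snippets refer to estimates the inverse of the local Lipschitz constant of ∇f from the two most recent iterates; up to the paper's exact constants, the gradient-descent version reads

\[
\lambda_k = \min\left\{\sqrt{1+\theta_{k-1}}\,\lambda_{k-1},\;
\frac{\|x_k - x_{k-1}\|}{2\,\|\nabla f(x_k) - \nabla f(x_{k-1})\|}\right\},
\qquad
\theta_k = \frac{\lambda_k}{\lambda_{k-1}},
\qquad
x_{k+1} = x_k - \lambda_k \nabla f(x_k).
\]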
Forward-reflected-backward method with variance reduction
2021
Computational optimization and applications
We propose a variance reduced algorithm for solving monotone variational inequalities. Without assuming strong monotonicity, cocoercivity, or boundedness of the domain, we prove almost sure convergence of the iterates generated by the algorithm to a solution. In the monotone case, the ergodic average converges with the optimal O(1/k) rate of convergence. When strong monotonicity is assumed, the algorithm converges linearly, without requiring knowledge of the strong monotonicity constant. We finalize with extensions and applications of our results to monotone inclusions, a class of non-monotone variational inequalities and Bregman projections.
doi:10.1007/s10589-021-00305-3
pmid:34720428
pmcid:PMC8550342
fatcat:dyctzkychrf2ndcc7axbrxml2q
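For context, the deterministic forward-reflected-backward step of Malitsky and Tam that this variance-reduced method builds on can be written, for the inclusion 0 ∈ (A + B)(z) with B monotone and L-Lipschitz and a stepsize γ < 1/(2L) (up to the paper's exact stepsize conditions), as

\[
z_{k+1} = J_{\gamma A}\big(z_k - 2\gamma B(z_k) + \gamma B(z_{k-1})\big),
\qquad
J_{\gamma A} = (I + \gamma A)^{-1},
\]

with the exact evaluations of B replaced in the stochastic setting by a variance-reduced estimator.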
Distributed Forward-Backward Methods for Ring Networks
[article]
2022
arXiv
pre-print
For the case when B_1 = ⋯ = B_{n−1} = 0, Malitsky & Tam ... Given λ ∈ (0, 2/L) and γ ∈ (0, 1 − λL/2) and an initial point z^0 = (z^0_1, . . . , z^0_{n−1}) ∈ H^{n−1}, our proposed algorithm for (7) generates ...
arXiv:2112.00274v2
fatcat:dswvtqzqgffj3ehl7exgrssfgi
Efficient, Quantitative Numerical Methods for Statistical Image Deconvolution and Denoising
[chapter]
2020
Topics in applied physics
We review the development of efficient numerical methods for statistical multi-resolution estimation of optical imaging experiments. In principle, this involves constrained linear deconvolution and denoising, and so these types of problems can be formulated as convex constrained, or even unconstrained, optimization. We address two main challenges: the first of these is to quantify convergence of iterative algorithms; the second challenge is to develop efficient methods for these large-scale problems without sacrificing the quantification of convergence. We review the state of the art for these challenges. 2010 Mathematics Subject Classification: Primary 49J52 · 49M20 · 90C26 · Secondary 15A29 · 47H09 · 65K05 · 65K10 · 94A08.
doi:10.1007/978-3-030-34413-9_12
fatcat:e35fzehsyrdnld254ujjj33g2e
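Purely schematically, the constrained convex formulations described here take the form

\[
\min_{x \ge 0}\; R(x) \quad \text{subject to} \quad K x - b \in \mathcal{C},
\]

where K is the linear imaging (convolution) operator, b the observed data, R a convex regularizer, and C a convex set encoding statistical multi-resolution constraints on the residual; the chapter's exact constraint sets and regularizers may differ, so this is only an indicative template.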
Resolvent Splitting for Sums of Monotone Operators with Minimal Lifting
[article]
2022
arXiv
pre-print
In this work, we study fixed point algorithms for finding a zero in the sum of n ≥ 2 maximally monotone operators by using their resolvents. More precisely, we consider the class of such algorithms where each resolvent is evaluated only once per iteration. For any algorithm from this class, we show that the underlying fixed point operator is necessarily defined on a d-fold Cartesian product space with d ≥ n−1. Further, we show that this bound is unimprovable by providing a family of examples for which d = n−1 is attained. This family includes the Douglas-Rachford algorithm as the special case when n = 2. Applications of the new family of algorithms in distributed decentralised optimisation and multi-block extensions of the alternating direction method of multipliers (ADMM) are discussed.
arXiv:2108.02897v2
fatcat:r43e65ic5bg77bulshlvbq67vm
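The n = 2 special case mentioned at the end, the Douglas-Rachford algorithm, can be sketched as follows, with one resolvent evaluation of each operator per iteration; the operators, stepsize gamma = 1, and data below are illustrative assumptions.

```python
import numpy as np

def douglas_rachford(J_A, J_B, z, iters=500, lam=1.0):
    """Find x with 0 in (A + B)(x) using only the resolvents J_A and J_B."""
    for _ in range(iters):
        x = J_A(z)
        y = J_B(2 * x - z)
        z = z + lam * (y - x)         # fixed-point update of the governing variable
    return J_A(z)                     # the "shadow" sequence J_A(z_k) approaches a zero of A + B

# Example: A = normal cone of the box [0, 1]^d (resolvent = projection),
#          B = gradient of a strongly convex quadratic (resolvent = linear solve).
d = 5
rng = np.random.default_rng(0)
Q = rng.standard_normal((d, d)); Q = Q.T @ Q + np.eye(d)
c = rng.standard_normal(d)
J_A = lambda z: np.clip(z, 0.0, 1.0)
J_B = lambda z: np.linalg.solve(np.eye(d) + Q, z - c)     # (I + B)^{-1} with stepsize gamma = 1
x_star = douglas_rachford(J_A, J_B, np.zeros(d))
```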
A First-Order Primal-Dual Algorithm with Linesearch
2018
SIAM Journal on Optimization
Figure 3: Convergence plots for problem (38).
Codes can be found on https://gitlab.icg.tugraz.at/malitsky/primal_dual_linesearch
http://math.nist.gov/MatrixMarket/ ...
doi:10.1137/16m1092015
fatcat:vo5un7cy6facjc3u5wveka5eve
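For orientation, the underlying first-order primal-dual (PDHG / Chambolle-Pock) iteration that the paper equips with a linesearch is sketched below on an assumed toy problem min_x (mu/2)*||x||^2 + ||Kx - b||_1; fixed steps with sigma*tau*||K||^2 < 1 are used here, whereas the paper's linesearch removes the need to know ||K||. All data and parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, mu = 30, 10, 0.1
K = rng.standard_normal((m, d))
b = rng.standard_normal(m)

L = np.linalg.norm(K, 2)             # operator norm; the linesearch variant avoids computing this
tau = sigma = 0.99 / L               # fixed primal/dual steps with sigma*tau*||K||^2 < 1
x = np.zeros(d); x_bar = x.copy(); y = np.zeros(m)

for k in range(2000):
    # dual step: prox of f*(y) = <b, y> + indicator{ ||y||_inf <= 1 } for f(u) = ||u - b||_1
    y = np.clip(y + sigma * (K @ x_bar - b), -1.0, 1.0)
    # primal step: prox of g(x) = (mu/2) * ||x||^2
    x_new = (x - tau * (K.T @ y)) / (1.0 + tau * mu)
    x_bar = 2 * x_new - x            # extrapolation of the primal variable
    x = x_new
```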
Showing results 1 — 15 out of 39 results