On the Convergence of Stochastic Compositional Gradient Descent Ascent Method

Hongchang Gao, Xiaoqian Wang, Lei Luo, Xinghua Shi
2021 Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence   unpublished
In this paper, we develop a novel efficient stochastic compositional gradient descent ascent method for optimizing the compositional minimax problem.  ...  Moreover, we establish the theoretical convergence rate of our proposed method.  ...
doi:10.24963/ijcai.2021/329 fatcat:yrgk7fobvfaz5coiwwbit6zqmq
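
For orientation, below is a minimal numpy sketch of the idea behind a stochastic compositional gradient descent ascent loop on a toy problem min_x max_y y^T g(x) - (lam/2)||y||^2 with g(x) = E_i[A_i x + b_i]: a running average u tracks the inner map g(x), and the descent/ascent steps use a stochastic Jacobian. All names, constants, and the toy objective are illustrative assumptions, not the algorithm or parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 50, 5, 3                       # samples, dim of x, dim of g(x) and y
A0 = rng.normal(size=(m, d))
b0 = rng.normal(size=m)
A = A0 + 0.3 * rng.normal(size=(n, m, d))   # stochastic inner maps g_i(x) = A_i x + b_i
b = b0 + 0.1 * rng.normal(size=(n, m))
lam, beta, eta_x, eta_y = 1.0, 0.2, 0.02, 0.1

x, y = np.zeros(d), np.zeros(m)
u = np.zeros(m)                          # running estimate of g(x) = E_i[A_i x + b_i]

for t in range(5000):
    i = rng.integers(n)
    g_i = A[i] @ x + b[i]                # one stochastic evaluation of the inner map
    u = (1 - beta) * u + beta * g_i      # track g(x) without a full pass over the data
    grad_x = A[i].T @ y                  # chain rule: stochastic Jacobian times df/du = y
    grad_y = u - lam * y                 # df/dy of f(u, y) = y.u - 0.5*lam*||y||^2
    x -= eta_x * grad_x                  # descent step on x
    y += eta_y * grad_y                  # ascent step on y

print("||E[g(x)]|| after training:",
      np.linalg.norm(A.mean(axis=0) @ x + b.mean(axis=0)))
```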

Min-Max Optimization without Gradients: Convergence and Applications to Black-Box Evasion and Poisoning Attacks

Sijia Liu, Songtao Lu, Xiangyi Chen, Yao Feng, Kaidi Xu, Abdullah Al-Dujaili, Mingyi Hong, Una-May O'Reilly
2020 International Conference on Machine Learning  
We present a principled optimization framework, integrating a zeroth-order (ZO) gradient estimator with an alternating projected stochastic gradient descent-ascent method, where the former only requires  ...  a small number of function queries and the latter needs just a one-step descent/ascent update.  ...  Acknowledgements This work was supported by the MIT-IBM Watson AI Lab research grant. M. Hong and X.  ... 
dblp:conf/icml/0001LCFXAHO20 fatcat:vxl2ujdvjbgv5goku5a53nt2mu
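
As a hedged illustration of the mechanics described above, the sketch below alternates a zeroth-order projected descent step on x (a two-point random-direction gradient estimate built from function queries only) with a first-order projected ascent step on y. The toy objective, constants, and constraint set are assumptions for the example, not the paper's attack losses.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 10
M = rng.normal(size=(d, d)) / d          # toy coupling matrix

def f(x, y):
    # toy nonconvex-strongly-concave objective: smooth in x, strongly concave in y
    return np.sin(x) @ M @ y - 0.5 * y @ y

def zo_grad_x(x, y, mu=1e-3, q=10):
    # average of q two-point finite-difference estimates along random directions:
    # only function queries f(.) are used, never a gradient with respect to x
    g = np.zeros_like(x)
    for _ in range(q):
        u = rng.normal(size=d)
        g += (f(x + mu * u, y) - f(x - mu * u, y)) / (2 * mu) * u
    return g / q

def project_ball(z, radius=5.0):         # projection onto the feasible set ||z|| <= radius
    nrm = np.linalg.norm(z)
    return z if nrm <= radius else z * (radius / nrm)

x, y = rng.normal(size=d), np.zeros(d)
for t in range(500):
    x = project_ball(x - 0.05 * zo_grad_x(x, y))        # ZO projected descent on x
    y = project_ball(y + 0.05 * (M.T @ np.sin(x) - y))  # first-order projected ascent on y

print("final objective value:", f(x, y))
```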

Page 6328 of Mathematical Reviews Vol. , Issue 91K [page]

1991 Mathematical Reviews  
The proof hinges on the technique of Lyapunov's direct method for the investigation of discrete processes developed by the author [Stochastic gradient methods for the solution of minimax problems, Part I (  ...  direction of descent.  ... 

Accelerated Stochastic Block Coordinate Descent with Optimal Sampling

Aston Zhang, Quanquan Gu
2016 Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '16  
We study the composite minimization problem where the objective function is the sum of two convex functions: one is the sum of a finite number of strongly convex and smooth functions, and the other is  ...  We propose an accelerated stochastic block coordinate descent (ASBCD) algorithm, which incorporates the incrementally averaged partial derivative into the stochastic partial derivative and exploits optimal  ...  We would like to thank the anonymous reviewers for their helpful comments.  ... 
doi:10.1145/2939672.2939819 dblp:conf/kdd/ZhangG16 fatcat:sl3vfm4lsndsnhelknh54v6vwa
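
For context, here is a minimal sketch of randomized block coordinate descent with non-uniform block sampling on a ridge-regression objective F(w) = 1/(2n)||Xw - y||^2 + (lam/2)||w||^2, where blocks are sampled in proportion to their block-wise Lipschitz constants, one common notion of "optimal sampling". This is a toy illustration under those assumptions, not the ASBCD algorithm from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, lam = 200, 20, 0.1
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

blocks = np.array_split(np.arange(d), 5)                 # 5 coordinate blocks
# block-wise Lipschitz constants of the partial gradient: L_j = ||X_j||_2^2 / n + lam
L = np.array([np.linalg.norm(X[:, B], 2) ** 2 / n + lam for B in blocks])
p = L / L.sum()                                          # sample block j with prob. prop. to L_j

w = np.zeros(d)
for t in range(3000):
    j = rng.choice(len(blocks), p=p)
    B = blocks[j]
    grad_B = X[:, B].T @ (X @ w - y) / n + lam * w[B]    # partial gradient on block B
    w[B] -= grad_B / L[j]                                # step size 1/L_j on block j

print("objective:", 0.5 * np.mean((X @ w - y) ** 2) + 0.5 * lam * w @ w)
```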

Decoupled Asynchronous Proximal Stochastic Gradient Descent with Variance Reduction [article]

Zhouyuan Huo, Bin Gu, Heng Huang
2016 arXiv   pre-print
However, it still suffers from a slow convergence rate because the variance of the stochastic gradient is nonzero.  ...  Recently, decoupled asynchronous proximal stochastic gradient descent (DAP-SGD) was proposed to minimize a composite function.  ...  Because of the variance of the stochastic gradient, we have to reduce the learning rate of the SGD method to guarantee convergence.  ... 
arXiv:1609.06804v2 fatcat:v6woa635mjaunfxk6vlptrqece
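
To illustrate the building block, below is a hedged sketch of a proximal stochastic gradient step with SVRG-style variance reduction on a composite objective F(w) = 1/(2n)||Xw - y||^2 + lam||w||_1. It is a generic serial prox-SVRG loop under illustrative parameter choices, not the decoupled asynchronous DAP-SGD scheme analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, lam = 500, 30, 0.05
X = rng.normal(size=(n, d))
w_true = np.zeros(d); w_true[:5] = 1.0
y = X @ w_true + 0.01 * rng.normal(size=n)
eta = 1.0 / (3 * np.max(np.sum(X ** 2, axis=1)))    # conservative step size

def grad_i(w, i):                       # gradient of the i-th smooth term 0.5*(x_i.w - y_i)^2
    return X[i] * (X[i] @ w - y[i])

def soft_threshold(z, t):               # proximal operator of t * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

w = np.zeros(d)
for epoch in range(20):
    w_snap = w.copy()
    full_grad = X.T @ (X @ w_snap - y) / n          # full gradient at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        # variance-reduced stochastic gradient: unbiased, with variance that
        # shrinks as both w and w_snap approach the solution
        g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
        w = soft_threshold(w - eta * g, eta * lam)  # proximal (soft-threshold) step

print("composite objective:",
      0.5 * np.mean((X @ w - y) ** 2) + lam * np.sum(np.abs(w)))
```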

Variance-Reduced Methods for Machine Learning [article]

Robert M. Gower, Mark Schmidt, Francis Bach, Peter Richtarik
2020 arXiv   pre-print
Stochastic optimization lies at the heart of machine learning, and its cornerstone is stochastic gradient descent (SGD), a method introduced over 60 years ago.  ...  These speedups underline the surge of interest in VR methods and the fast-growing body of work on this topic.  ...  In particular, Quanquan's recommendations for the non-convex section improved the organization of our Section G.  ... 
arXiv:2010.00892v1 fatcat:a6y55epyrfbqtm7znhj3jf5xfi

Projected Semi-Stochastic Gradient Descent Method with Mini-Batch Scheme under Weak Strong Convexity Assumption [article]

Jie Liu, Martin Takac
2017 arXiv   pre-print
We propose a projected semi-stochastic gradient descent method with a mini-batch scheme for improving both the theoretical complexity and practical performance of the general stochastic gradient descent method  ...  Our PS2GD preserves the low cost per iteration and high optimization accuracy via a stochastic-gradient variance-reduction technique, and admits a simple parallel implementation with mini-batches.  ...  The research of Jie Liu and Martin Takáč was supported by National Science Foundation grant CCF-1618717. We would like to thank Ji Liu for his helpful suggestions on related works.  ... 
arXiv:1612.05356v3 fatcat:qulr4dl3rza2vgklrfqq4im5dy
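
As a minimal illustration of the ingredients named above, the sketch below combines an SVRG-style variance-reduced gradient computed on a mini-batch with a projection onto a convex feasible set (here an l2 ball). The constraint, data, and step size are illustrative assumptions and are not taken from PS2GD.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, batch, radius = 400, 15, 10, 2.0
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)

def project(w):                                    # projection onto ||w|| <= radius
    nrm = np.linalg.norm(w)
    return w if nrm <= radius else w * (radius / nrm)

eta = 1.0 / (3 * np.max(np.sum(X ** 2, axis=1)))   # conservative step size
w = np.zeros(d)
for epoch in range(15):
    w_snap = w.copy()
    full_grad = X.T @ (X @ w_snap - y) / n         # full gradient at the snapshot
    for _ in range(n // batch):
        S = rng.choice(n, size=batch, replace=False)
        g_cur = X[S].T @ (X[S] @ w - y[S]) / batch
        g_old = X[S].T @ (X[S] @ w_snap - y[S]) / batch
        w = project(w - eta * (g_cur - g_old + full_grad))  # projected VR step

print("constrained least-squares objective:", 0.5 * np.mean((X @ w - y) ** 2))
```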

Stochastic Optimization with Importance Sampling for Regularized Loss Minimization

Peilin Zhao, Tong Zhang
2015 International Conference on Machine Learning  
affects the convergence of the underlying optimization procedure.  ...  Uniform sampling of training data has been commonly used in traditional stochastic optimization algorithms such as Proximal Stochastic Mirror Descent (prox-SMD) and Proximal Stochastic Dual Coordinate  ...  Acknowledgments The research of Peilin Zhao and Tong Zhang is partially supported by NSF-IIS-1407939 and NSF-IIS 1250985.  ... 
dblp:conf/icml/ZhaoZ15 fatcat:pxp7niihfbab3ar53n6b3jzlim
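
The general idea can be sketched as follows: sample example i with probability p_i proportional to its smoothness constant (here ||x_i||^2 for least squares) and reweight the stochastic gradient by 1/(n p_i) so the update stays unbiased; the step size can then scale with the average rather than the maximum smoothness constant. This is a hedged toy for plain SGD, not the proximal variants (prox-SMD / prox-SDCA) treated in the paper.

```python
import numpy as np

rng = np.random.default_rng(5)
n, d = 300, 10
X = rng.normal(size=(n, d)) * rng.uniform(0.1, 3.0, size=(n, 1))  # heterogeneous rows
y = X @ rng.normal(size=d)

L = np.sum(X ** 2, axis=1)             # per-sample smoothness constants
p = L / L.sum()                        # importance-sampling distribution
eta = 1.0 / (2 * L.mean())             # step size tied to mean(L), not max(L)

w = np.zeros(d)
for t in range(5000):
    i = rng.choice(n, p=p)
    grad_i = X[i] * (X[i] @ w - y[i])  # gradient of the i-th squared loss
    w -= eta * grad_i / (n * p[i])     # reweighting by 1/(n p_i) keeps the update unbiased

print("mean squared error:", np.mean((X @ w - y) ** 2))
```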

Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex Minimax Machine Learning [article]

Ziyi Chen, Shaocong Ma, Yi Zhou
2022 arXiv   pre-print
Alternating gradient-descent-ascent (AltGDA) is an optimization algorithm that has been widely used for model training in various machine learning applications and aims to solve a nonconvex minimax  ...  We demonstrate the effectiveness of our algorithm via an experiment on adversarial deep learning.  ...  A basic algorithm for solving the above minimax optimization problem is gradient-descent-ascent (GDA), which simultaneously performs a gradient descent update and a gradient ascent update on the variables  ... 
arXiv:2112.11663v7 fatcat:bgkeeeofijadzajcum44pbadsa
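
Below is a minimal sketch of the plain alternating scheme the snippet refers to: a descent step on x followed by an ascent step on y that already uses the updated x, shown on a toy nonconvex-strongly-concave objective sum(cos(x)) + x.y - ||y||^2. Step sizes and the objective are illustrative; the paper's accelerated proximal variant adds momentum and proximal terms not shown here.

```python
import numpy as np

d = 4
x, y = np.array([2.0, -1.0, 0.5, 1.5]), np.zeros(d)
eta_x, eta_y = 0.05, 0.2

def grad_x(x, y):
    return -np.sin(x) + y               # d/dx of sum(cos(x)) + x.y - ||y||^2

def grad_y(x, y):
    return x - 2.0 * y                  # d/dy of the same objective

for t in range(2000):
    x = x - eta_x * grad_x(x, y)        # gradient descent step on x with the current y
    y = y + eta_y * grad_y(x, y)        # gradient ascent step on y with the *updated* x

print("stationarity gaps:", np.linalg.norm(grad_x(x, y)), np.linalg.norm(grad_y(x, y)))
```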

AUC Maximization in the Era of Big Data and AI: A Survey

Tianbao Yang, Yiming Ying
2022 ACM Computing Surveys  
Area under the ROC curve, a.k.a. AUC, is a measure of choice for assessing the performance of a classifier for imbalanced data.  ...  However, to the best of our knowledge there is no comprehensive survey of related works for AUC maximization. This paper aims to address the gap by reviewing the literature in the past two decades.  ...  They also analyze the convergence rates for multiple stochastic algorithms A, including stochastic gradient descent ascent, stochastic optimistic gradient descent ascent, stochastic primal-dual STORM updates  ... 
doi:10.1145/3554729 fatcat:qkynxjqm5jeejoljdfodqchvfe
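
One of the algorithm families listed in the snippet is optimistic gradient descent ascent (OGDA). As a hedged, generic illustration (not the AUC-specific updates surveyed in the paper), the sketch below runs OGDA on a bilinear problem min_x max_y x^T A y, where the "optimistic" correction with the previous gradient yields convergence to the saddle point at the origin even though plain simultaneous GDA would cycle or diverge. The matrix and step size are assumptions for the demo.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])                        # well-conditioned coupling matrix
eta = 1.0 / (4.0 * np.linalg.norm(A, 2))          # conservative step size

x, y = np.array([1.0, -1.0]), np.array([0.5, 2.0])
gx_prev, gy_prev = A @ y, A.T @ x                 # gradients at the previous iterate

for t in range(2000):
    gx, gy = A @ y, A.T @ x                       # current gradients of x^T A y
    x = x - eta * (2.0 * gx - gx_prev)            # "optimistic" descent step on x
    y = y + eta * (2.0 * gy - gy_prev)            # "optimistic" ascent step on y
    gx_prev, gy_prev = gx, gy

print("distance to the saddle point at the origin:",
      np.linalg.norm(x) + np.linalg.norm(y))
```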

Stochastic Dual Coordinate Ascent with Alternating Direction Method of Multipliers

Taiji Suzuki
2014 International Conference on Machine Learning  
Although the original ADMM is a batch method, the proposed method offers a stochastic update rule where each iteration requires only one or a few sample observations.  ...  We propose a new stochastic dual coordinate ascent technique that can be applied to a wide range of regularized learning problems.  ...  Acknowledgement TS was partially supported by MEXT Kakenhi 25730013, and the Aihara Project, the FIRST program from JSPS, initiated by CSTP.  ... 
dblp:conf/icml/Suzuki14 fatcat:66ombqvbnrhptld3g55peoxtte
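
For background, here is a hedged sketch of the plain stochastic dual coordinate ascent (SDCA) building block on ridge regression, where each dual coordinate update has a closed form and the primal iterate is kept in sync as w = X^T alpha / (lam n). This is not the ADMM-coupled variant the paper proposes for structured regularizers; data and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
n, d, lam = 300, 10, 0.1
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)

alpha = np.zeros(n)                          # dual variables, one per example
w = X.T @ alpha / (lam * n)                  # primal iterate kept in sync with alpha

for t in range(20 * n):
    i = rng.integers(n)
    # closed-form maximization of the dual objective over the single coordinate alpha_i
    delta = (y[i] - X[i] @ w - alpha[i]) / (1.0 + X[i] @ X[i] / (lam * n))
    alpha[i] += delta
    w += delta * X[i] / (lam * n)            # maintain w = X^T alpha / (lam * n)

print("primal objective:", 0.5 * np.mean((X @ w - y) ** 2) + 0.5 * lam * w @ w)
```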

Stochastic Optimization with Importance Sampling [article]

Peilin Zhao, Tong Zhang
2015 arXiv   pre-print
Uniform sampling of training data has been commonly used in traditional stochastic optimization algorithms such as Proximal Stochastic Gradient Descent (prox-SGD) and Proximal Stochastic Dual Coordinate  ...  affects the convergence of the underlying optimization procedure.  ...  Related Work We review some related work on Proximal Stochastic Gradient Descent (including more general proximal stochastic mirror descent) and Proximal Stochastic Dual Coordinate Ascent.  ... 
arXiv:1401.2753v2 fatcat:ngpgjou4qvbtjesd4ruecsmn74

AUC Maximization in the Era of Big Data and AI: A Survey [article]

Tianbao Yang, Yiming Ying
2022 arXiv   pre-print
Area under the ROC curve, a.k.a. AUC, is a measure of choice for assessing the performance of a classifier for imbalanced data.  ...  However, to the best of our knowledge there is no comprehensive survey of related works for AUC maximization. This paper aims to address the gap by reviewing the literature in the past two decades.  ...  ACKNOWLEDGMENTS We thank the editors and anonymous reviewers for their constructive comments. T.  ... 
arXiv:2203.15046v3 fatcat:eyl3gvqyk5am7a33qabauo2vie

Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems [article]

Luo Luo, Haishan Ye, Zhichao Huang, Tong Zhang
2020 arXiv   pre-print
This method achieves the best known stochastic gradient complexity of 𝒪(κ^3ε^-3), and its dependency on ε is optimal for this problem.  ...  In this paper, we propose a novel method called Stochastic Recursive gradiEnt Descent Ascent (SREDA), which estimates gradients more efficiently using variance reduction.  ...  Acknowledgments and Disclosure of Funding The authors would like to thank Min Tao and Jiahao Xie for pointing out that the first version of this paper on arXiv has a mistake in the original proof of Theorem  ... 
arXiv:2001.03724v2 fatcat:rg2yygen7fg5jg6udqb7ru2zji
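
To convey the core mechanism, the sketch below embeds a SARAH/SPIDER-style recursive gradient estimator (refreshed with a full gradient every q iterations and updated recursively in between) in a stochastic descent-ascent loop on a toy finite-sum minimax objective. The objective and all constants are assumptions for the illustration; this is not the full SREDA algorithm, which also uses a concave-maximizer subroutine and specific parameter schedules.

```python
import numpy as np

rng = np.random.default_rng(9)
n, d, q, eta = 200, 5, 20, 0.02
A = rng.normal(size=(n, d))
B = rng.normal(size=(n, d))

def grads(x, y, idx):
    # per-sample gradients of f_i(x,y) = 0.5(a_i.x)^2 + (a_i.x)(b_i.y) - 0.5(b_i.y)^2
    a, b = A[idx], B[idx]
    return a * (a @ x + b @ y), b * (a @ x - b @ y)

def full_grads(x, y):
    return A.T @ (A @ x + B @ y) / n, B.T @ (A @ x - B @ y) / n

x, y = rng.normal(size=d), rng.normal(size=d)
for t in range(2000):
    if t % q == 0:
        vx, vy = full_grads(x, y)                 # periodic full-gradient refresh
    x_new, y_new = x - eta * vx, y + eta * vy     # descent on x, ascent on y
    i = rng.integers(n)
    gx_new, gy_new = grads(x_new, y_new, i)
    gx_old, gy_old = grads(x, y, i)
    vx = vx + gx_new - gx_old                     # recursive (SARAH-style) estimator update
    vy = vy + gy_new - gy_old
    x, y = x_new, y_new

print("distance to the saddle point at 0:", np.linalg.norm(x) + np.linalg.norm(y))
```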

Stochastic Primal-Dual Proximal ExtraGradient descent for compositely regularized optimization

Tianyi Lin, Linbo Qiao, Teng Zhang, Jiashi Feng, Bofeng Zhang
2018 Neurocomputing  
To address these issues, we propose a stochastic variant of extra-gradient type methods, namely Stochastic Primal-Dual Proximal ExtraGradient descent (SPDPEG), and analyze its convergence property for  ...  On the other hand, the calculation of the full gradient of the expectation in the objective is very expensive when the number of input data samples is considerably large.  ...  In this work, we propose a Stochastic Primal-Dual Proximal Extra-Gradient Descent (SPDPEG) method, which inherits the advantages of EGADM and stochastic methods.  ... 
doi:10.1016/j.neucom.2017.07.066 fatcat:vupm42zhv5g4to2jwsnbzjsrf4
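
For intuition, here is a hedged sketch of a stochastic extragradient step: take a trial (extrapolation) step with a stochastic gradient, re-evaluate the same stochastic gradient at the trial point, then take the actual step from the original iterate. The toy strongly-convex-strongly-concave objective below is an assumption for the demo and omits the proximal/composite terms handled by SPDPEG.

```python
import numpy as np

rng = np.random.default_rng(10)
n, d, mu, eta = 200, 5, 0.1, 0.02
A = rng.normal(size=(n, d))
B = rng.normal(size=(n, d))

def stoch_grads(x, y, idx):
    # gradients of f_i(x,y) = (a_i.x)(b_i.y) + 0.5*mu*||x||^2 - 0.5*mu*||y||^2
    a, b = A[idx], B[idx]
    return a * (b @ y) + mu * x, b * (a @ x) - mu * y

x, y = rng.normal(size=d), rng.normal(size=d)
for t in range(4000):
    i = rng.integers(n)
    gx, gy = stoch_grads(x, y, i)
    x_half, y_half = x - eta * gx, y + eta * gy      # extrapolation (trial) step
    gx_h, gy_h = stoch_grads(x_half, y_half, i)      # same sample, evaluated at the trial point
    x, y = x - eta * gx_h, y + eta * gy_h            # actual update from the original iterate

print("distance to the saddle point at 0:", np.linalg.norm(x) + np.linalg.norm(y))
```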
Showing results 1 — 15 out of 1,366 results