
An inexact subsampled proximal Newton-type method for large-scale machine learning [article]

Xuanqing Liu, Cho-Jui Hsieh, Jason D. Lee, Yuekai Sun
2017 arXiv   pre-print
As long as n > d, the proposed method is more efficient than state-of-the-art accelerated stochastic first-order methods for non-smooth regularizers, which require Õ(d(n + √(κ n))(1/ϵ)) FLOPS.  ...  We propose a fast proximal Newton-type algorithm for minimizing regularized finite sums that returns an ϵ-suboptimal point in Õ(d(n + √(κ d))(1/ϵ)) FLOPS, where n is the number of samples and d is the feature dimension.  ...  Subsampled Proximal Newton-type Methods: The proposed method is, at its core, a proximal Newton-type method.  ... 
arXiv:1708.08552v1 fatcat:pr7nrwnc3jaybize2k2544ml3i
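
The entry above describes a proximal Newton-type method whose Hessian is formed by subsampling. As a point of reference, here is a minimal sketch of one such step for an ℓ1-regularized finite sum; the function names, the damping constant, and the fixed ISTA inner loop are illustrative assumptions rather than the authors' algorithm.

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise soft-thresholding: prox of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def subsampled_prox_newton_step(x, grad, hess_i, n, lam, sample_size,
                                inner_iters=50, rng=np.random.default_rng(0)):
    """One proximal Newton-type step with a subsampled Hessian.

    grad(x)      -> full gradient of the smooth part f at x
    hess_i(x, i) -> Hessian contribution of sample i (d x d)
    lam          -> l1 regularization weight
    """
    d = x.shape[0]
    g = grad(x)
    # Subsampled Hessian: average over a random subset of samples.
    idx = rng.choice(n, size=sample_size, replace=False)
    H = sum(hess_i(x, i) for i in idx) / sample_size
    H += 1e-8 * np.eye(d)                     # small damping for safety

    # Approximately solve  min_p  g^T p + 0.5 p^T H p + lam*||x + p||_1
    # with a few proximal-gradient (ISTA) inner iterations.
    L = np.linalg.norm(H, 2)                  # Lipschitz constant of the quadratic model
    p = np.zeros(d)
    for _ in range(inner_iters):
        grad_model = g + H @ p
        p = soft_threshold(x + p - grad_model / L, lam / L) - x
    return x + p
```

In practice the inner subproblem is solved only approximately, which is the kind of inexactness the paper quantifies.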

Do Subsampled Newton Methods Work for High-Dimensional Data?

Xiang Li, Shusen Wang, Zhihua Zhang
2020 Proceedings of the AAAI Conference on Artificial Intelligence  
Our theory covers three types of Newton methods: subsampled Newton, distributed Newton, and proximal Newton.  ...  Subsampled Newton methods approximate Hessian matrices through subsampling techniques to alleviate the per-iteration cost.  ...  If r(·) = 0, then proximal Newton reduces to the standard Newton method.  ... 
doi:10.1609/aaai.v34i04.5905 fatcat:tk7idalcd5dpdjh2clv64yz7hi
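
The excerpt notes that proximal Newton coincides with the standard Newton method when the nonsmooth part r vanishes. For a generic composite objective f + r with Hessian approximation H_k, the reduction is immediate:

```latex
% Proximal Newton step for  min_x f(x) + r(x),  with H_k an approximation of the Hessian of f at x_k.
% If r is identically zero, the subproblem is an unconstrained quadratic whose minimizer is the Newton step.
\[
x_{k+1} = \arg\min_{y}\; \nabla f(x_k)^\top (y - x_k)
          + \tfrac{1}{2}(y - x_k)^\top H_k (y - x_k) + r(y),
\qquad
r \equiv 0 \;\Rightarrow\; x_{k+1} = x_k - H_k^{-1}\nabla f(x_k).
\]
```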

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation [article]

Si Yi Meng, Sharan Vaswani, Issam Laradji, Mark Schmidt, Simon Lacoste-Julien
2020 arXiv   pre-print
Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step size and a constant batch size.  ...  By growing the batch size for both the subsampled gradient and Hessian, we show that R-SSN can converge at a quadratic rate in a local neighbourhood of the solution.  ...  Subsampled Newton Methods: In this section, we present our main theoretical results for R-SSN.  ... 
arXiv:1910.04920v2 fatcat:jfzvxawxdrcp3ocfbnxmzh4fi4
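
A rough Python sketch of a regularized subsampled Newton (R-SSN-style) iteration with a growing batch size follows; the regularization constant, growth factor, and fixed step size are placeholder choices, not the adaptive schedule analysed in the paper.

```python
import numpy as np

def r_ssn(grad_b, hess_b, x0, n, tau=1e-2, batch0=64, growth=1.5,
          step=1.0, iters=50, rng=np.random.default_rng(0)):
    """Regularized subsampled Newton with a geometrically growing batch.

    grad_b(x, idx) -> averaged gradient over the samples in idx
    hess_b(x, idx) -> averaged Hessian over the samples in idx
    tau            -> Tikhonov regularization added to the subsampled Hessian
    """
    x, batch = x0.copy(), float(batch0)
    d = x.shape[0]
    for _ in range(iters):
        idx = rng.choice(n, size=min(int(batch), n), replace=False)
        g = grad_b(x, idx)
        H = hess_b(x, idx) + tau * np.eye(d)   # regularized subsampled Hessian
        x -= step * np.linalg.solve(H, g)      # (damped) Newton-type update
        batch *= growth                        # grow batch for gradient and Hessian
    return x
```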

Do Subsampled Newton Methods Work for High-Dimensional Data? [article]

Xiang Li, Shusen Wang, Zhihua Zhang
2019 arXiv   pre-print
This paper theoretically justifies the effectiveness of subsampled Newton methods on high-dimensional data.  ...  Subsampled Newton methods approximate Hessian matrices through subsampling techniques, alleviating the cost of forming Hessian matrices while retaining sufficient curvature information.  ...  If r(·) = 0, then proximal Newton is the same as the standard Newton method.  ... 
arXiv:1902.04952v2 fatcat:i3xezlqjpbfxtf72vk4j6rls4m

A quasi-Newton proximal splitting method [article]

Stephen Becker, M. Jalal Fadili
2012 arXiv   pre-print
The second part of the paper applies the previous result to the acceleration of convex minimization problems and leads to an elegant quasi-Newton method.  ...  The optimization method compares favorably against state-of-the-art alternatives.  ...  IPM requires solving a Newton-step equation, so first-order-like "Hessian-free" variants of IPM solve the Newton step approximately, either by approximately solving the equation or by subsampling the Hessian  ... 
arXiv:1206.1156v2 fatcat:vfqghdatqvdp5kdhvbncgbvkbu
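
The snippet refers to proximal splitting with a quasi-Newton metric. The generic variable-metric proximal step that this family of methods builds on is recalled below for orientation; the specific low-rank metrics used in the paper are not reproduced here.

```latex
% Proximal step in the metric induced by a positive definite quasi-Newton matrix B_k,
% for a composite problem  min_x f(x) + g(x)  with smooth f and "simple" g.
\[
x_{k+1} = \operatorname{prox}^{B_k}_{g}\!\bigl(x_k - B_k^{-1}\nabla f(x_k)\bigr),
\qquad
\operatorname{prox}^{B}_{g}(y) = \arg\min_{u}\; g(u) + \tfrac{1}{2}\lVert u - y\rVert_{B}^{2}.
\]
```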

Training L1-Regularized Models with Orthant-Wise Passive Descent Algorithms [article]

Jianqiao Wangni
2018 arXiv   pre-print
The quasi-Newton update can be used to incorporate curvature information and accelerate convergence.  ...  In this paper, we propose the orthant-wise passive descent algorithm (OPDA) for optimizing L_1-regularized models, as an improved substitute for proximal algorithms, which are the standard tools for optimizing  ...  We can also incorporate a block BFGS method (Gower, Goldfarb, and Richtárik 2016) to accelerate the calculation of the direction Hv.  ... 
arXiv:1704.07987v3 fatcat:bp5f4tmyubg53gvf5ilj3dx2tu
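
For background on orthant-wise schemes for ℓ1-regularized objectives, the sketch below computes the standard orthant-wise pseudo-gradient in the style of OWL-QN; it illustrates the general family only and is not the OPDA update from this paper.

```python
import numpy as np

def pseudo_gradient(x, grad_f, lam):
    """Orthant-wise pseudo-gradient of f(x) + lam*||x||_1 (OWL-QN-style).

    Where x_i != 0 the l1 term is differentiable; where x_i == 0 the steepest
    descending one-sided derivative is chosen, or 0 if both sides increase.
    """
    g = grad_f(x)
    pg = np.zeros_like(x)
    nz = x != 0
    pg[nz] = g[nz] + lam * np.sign(x[nz])

    zero = ~nz
    right = g[zero] + lam        # directional derivative for x_i -> 0+
    left = g[zero] - lam         # directional derivative for x_i -> 0-
    pg_zero = np.zeros(zero.sum())
    pg_zero[left > 0] = left[left > 0]      # objective decreases toward the negative orthant
    pg_zero[right < 0] = right[right < 0]   # objective decreases toward the positive orthant
    pg[zero] = pg_zero
    return pg
```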

Learning to Exploit Proximal Force Sensing: A Comparison Approach [chapter]

Matteo Fumagalli, Arjan Gijsberts, Serena Ivaldi, Lorenzo Jamone, Giorgio Metta, Lorenzo Natale, Francesco Nori, Giulio Sandini
2010 Studies in Computational Intelligence  
evaluated on the normalized mean square error (NMSE), and the comparison is made with respect to the dimension of the training set, the information contained in the input space and, finally, using a Euclidean subsampling  ...  The chosen motors produce limited torques, which results in relatively low accelerations.  ...  In contrast, random subsampling performs better than Euclidean subsampling based on joint positions, velocities and accelerations.  ... 
doi:10.1007/978-3-642-05181-4_7 fatcat:dmlhj3o2ovge3ntmxdot5tkfc4
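
The chapter compares learned models by normalized mean square error and contrasts random with Euclidean subsampling of the training set. A minimal sketch of both notions follows; the greedy farthest-point rule is only one plausible reading of "Euclidean subsampling" and is an assumption, not the chapter's exact procedure.

```python
import numpy as np

def nmse(y_true, y_pred):
    """Normalized mean square error: MSE divided by the variance of the targets."""
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)

def random_subsample(X, k, rng=np.random.default_rng(0)):
    """Pick k training points uniformly at random."""
    return rng.choice(len(X), size=k, replace=False)

def euclidean_subsample(X, k):
    """Greedy farthest-point selection in Euclidean distance
    (one possible coverage-based subsampling; an illustrative assumption)."""
    idx = [0]
    dists = np.linalg.norm(X - X[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))
        idx.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(idx)
```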

Nesterov's Acceleration For Second Order Method [article]

Haishan Ye, Zhihua Zhang
2017 arXiv   pre-print
In this paper, we resort to Nesterov's acceleration technique to improve the convergence performance of a class of second-order methods called approximate Newton.  ...  We give a theoretical analysis showing that Nesterov's acceleration technique can improve the convergence performance of approximate Newton just as it does for first-order methods.  ...  Algorithm 3 (Accelerated Regularized Subsampled Newton). 1: Input: x^(0), 0 < δ < 1, regularization parameter α, sample size |S|, acceleration parameter θ^(t); let y^(0) = x^(0). 2: for t = 0, ... until termination  ... 
arXiv:1705.07171v2 fatcat:cksdqnyi3jgt5eqyup6vms5ukq
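
The excerpt reproduces the input line of Algorithm 3 (Accelerated Regularized Subsampled Newton). A rough sketch of how Nesterov-style extrapolation can wrap a regularized subsampled Newton update is given below; the constant momentum parameter and unit step size are illustrative simplifications of the θ^(t) schedule in the paper.

```python
import numpy as np

def accelerated_subsampled_newton(grad_b, hess_b, x0, n, alpha=1e-2,
                                  sample_size=128, theta=0.9, iters=50,
                                  rng=np.random.default_rng(0)):
    """Nesterov-style extrapolation around a regularized subsampled Newton step.

    grad_b(y, idx) -> averaged gradient over the samples in idx, evaluated at y
    hess_b(y, idx) -> averaged Hessian over the samples in idx, evaluated at y
    alpha          -> regularization added to the subsampled Hessian
    """
    x_prev = x0.copy()
    x = x0.copy()
    d = x0.shape[0]
    for _ in range(iters):
        y = x + theta * (x - x_prev)                    # extrapolation (momentum) point
        idx = rng.choice(n, size=sample_size, replace=False)
        g = grad_b(y, idx)
        H = hess_b(y, idx) + alpha * np.eye(d)          # regularized subsampled Hessian at y
        x_prev, x = x, y - np.linalg.solve(H, g)        # Newton-type step taken from y
    return x
```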

Nesterov's Acceleration For Approximate Newton [article]

Haishan Ye, Zhihua Zhang
2017 arXiv   pre-print
In this paper, we resort to Nesterov's acceleration technique to improve the convergence performance of a class of second-order methods called approximate Newton.  ...  We give a theoretical analysis showing that Nesterov's acceleration technique can improve the convergence performance of approximate Newton just as it does for first-order methods.  ...  Newton method.  ... 
arXiv:1710.08496v1 fatcat:hdyml3xldfb3pap5brt7t5sdvm

New Proximal Newton-Type Methods for Convex Optimization [article]

Ilan Adler, Zhiyue Tom Hu, Tianyi Lin
2020 arXiv   pre-print
In this paper, we propose new proximal Newton-type methods for convex optimization problems in composite form. The applications include model predictive control (MPC) and embedded MPC.  ...  Experimental results on real-world datasets demonstrate the effectiveness and efficiency of the new methods.  ...  method and [23] for analyzing the inexact proximal Newton method.  ... 
arXiv:2007.09525v1 fatcat:ulfhoqlw3va6pbkcyirobzmfbi

Preface of the special issue dedicated to the XII Brazilian workshop on continuous optimization

Ernesto G. Birgin
2020 Computational optimization and applications  
The proposed approach accelerates the classical proximal point method for convex functions. R. Behling, J. Y. Bello Cruz, and L. R.  ...  Professor Martínez's contributions to Optimization include, but are not limited to, several theoretical and practical aspects of Quasi-Newton methods, Inexact Restoration methods, Spectral Projected Gradients  ... 
doi:10.1007/s10589-020-00203-0 fatcat:ly7txyiw5fed7lvi7oalvsm5za

Cubic Regularization with Momentum for Nonconvex Optimization [article]

Zhe Wang, Yi Zhou, Yingbin Liang, Guanghui Lan
2019 arXiv   pre-print
method and explore the potential for acceleration.  ...  Momentum is a popular technique to accelerate convergence in practical training, and its impact on convergence guarantees has been well studied for first-order algorithms.  ...  Stochastic variance-reduced cubic regularized Newton method. arXiv:1802.04796.  ... 
arXiv:1810.03763v2 fatcat:cyw6ycinu5bblnxel4ytqxc4om
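
For reference, the cubic-regularized Newton subproblem that the paper combines with momentum is recalled below; the extrapolation step written here is a generic Nesterov-style form and only an illustrative stand-in for the paper's exact update.

```latex
% Cubic-regularized Newton subproblem at the extrapolated point y_k (M is the cubic penalty parameter),
% followed by a generic momentum/extrapolation step with weight beta_k.
\[
s_k = \arg\min_{s}\; \nabla f(y_k)^\top s + \tfrac{1}{2}\, s^\top \nabla^2 f(y_k)\, s
      + \tfrac{M}{6}\lVert s\rVert^3,
\qquad
x_{k+1} = y_k + s_k,\quad
y_{k+1} = x_{k+1} + \beta_k\,(x_{k+1} - x_k).
\]
```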

Distributed Newton Methods for Deep Neural Networks [article]

Chien-Chih Wang, Kent Loong Tan, Chun-Ting Chen, Yu-Hsiang Lin, S. Sathiya Keerthi, Dhruv Mahajan, S. Sundararajan, Chih-Jen Lin
2018 arXiv   pre-print
Second, we consider subsampled Gauss-Newton matrices for reducing the running time as well as the communication cost.  ...  In this paper, we focus on situations where the model is distributedly stored, and propose a novel distributed Newton method for training deep neural networks.  ...  Under dense initialization, they train their network with the Path-SGD method, which uses a proximal gradient method to solve the optimization problem.  ... 
arXiv:1802.00130v1 fatcat:zyxjrh2xszbhndcl7oidqycu64
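
The subsampled Gauss-Newton matrices mentioned in the abstract have the generic form recalled below, where the Jacobians and per-sample loss curvatures are those of the network being trained.

```latex
% Subsampled Gauss-Newton matrix over a minibatch S of training samples:
% J_i is the Jacobian of the network output w.r.t. the parameters theta for sample i,
% and B_i is the (positive semidefinite) Hessian of the loss w.r.t. the network output.
\[
G_S(\theta) \;=\; \frac{1}{\lvert S\rvert} \sum_{i \in S} J_i(\theta)^\top B_i(\theta)\, J_i(\theta).
\]
```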

The Practicality of Stochastic Optimization in Imaging Inverse Problems [article]

Junqi Tang, Karen Egiazarian, Mohammad Golbabaee, Mike Davies
2019 arXiv   pre-print
Finally, we propose an accelerated primal-dual SGD algorithm to tackle another key bottleneck of stochastic optimization: the heavy computation of proximal operators.  ...  Surprisingly, in some tasks such as image deblurring, many such methods fail to converge faster than accelerated deterministic gradient methods, even in terms of epoch counts.  ...  Newton-steps) and solve these subproblems with deterministic or stochastic proximal gradient methods.  ... 
arXiv:1910.10100v2 fatcat:sh2elim45vfx7bh6xizuzi6e3u
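
On the proximal-operator bottleneck mentioned above: some proximal maps have cheap closed forms while others require an inner solver. A minimal example of the cheap case (the ℓ1 prox) is shown below for contrast; prox_l1 is an illustrative helper, not part of the paper's code.

```python
import numpy as np

def prox_l1(v, tau):
    """Closed-form prox of tau*||.||_1: elementwise soft-thresholding, O(d) cost."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

# By contrast, proximal maps such as that of total variation have no closed form
# and are typically computed with an inner iterative solver; this per-iteration
# cost is the bottleneck the accelerated primal-dual scheme aims to sidestep.
```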

Inexact Proximal Cubic Regularized Newton Methods for Convex Optimization [article]

Chaobing Song, Ji Liu, Yong Jiang
2019 arXiv   pre-print
In this paper, we use Proximal Cubic regularized Newton Methods (PCNM) to optimize the sum of a smooth convex function and a non-smooth convex function, where we use inexact gradient and Hessian, and an  ...  We propose inexact variants of PCNM and accelerated PCNM respectively, and show that both variants can achieve the same convergence rate as in the exact case, provided that the errors in the inexact gradient  ...  If F(x) is σ_2-strongly convex,  ...  Algorithm 2 (Accelerated inexact proximal cubic regularized Newton method, AIPCNM). 1: Input: x_0 ∈ R  ... 
arXiv:1902.02388v2 fatcat:bvgp77kydnhufhrkfortlaspxa
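
The proximal cubic-regularized Newton step underlying PCNM, written with exact oracles, is recalled below; the inexact variants studied in the paper replace ∇f and ∇²f by approximations whose errors satisfy the conditions stated in the abstract.

```latex
% Proximal cubic-regularized Newton step for  min_x f(x) + g(x),
% with smooth convex f, non-smooth convex g, and cubic penalty parameter M:
\[
x_{k+1} = \arg\min_{y}\; \langle \nabla f(x_k),\, y - x_k\rangle
        + \tfrac{1}{2}(y - x_k)^\top \nabla^2 f(x_k)\,(y - x_k)
        + \tfrac{M}{6}\lVert y - x_k\rVert^3 + g(y).
\]
```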
Showing results 1–15 of 784.