828 Hits in 6.5 sec

Iterative Pre-Conditioning to Expedite the Gradient-Descent Method [article]

Kushal Chakrabarti, Nirupam Gupta, Nikhil Chopra
2020 arXiv   pre-print
In this paper, we propose an iterative pre-conditioning approach that can significantly attenuate the influence of the problem's conditioning on the convergence-speed of the gradient-descent method.  ...  Specifically, the method requires a large number of iterations to converge to a solution if the optimization problem is ill-conditioned.  ...  ACKNOWLEDGEMENTS This work is being carried out as a part of the Pipeline System Integrity Management Project, which is supported by the Petroleum Institute, Khalifa University of Science and Technology  ... 
arXiv:2003.07180v2 fatcat:kgt46uwwujcghmrz7osfanxwgm

Iterative Pre-Conditioning for Expediting the Gradient-Descent Method: The Distributed Linear Least-Squares Problem [article]

Kushal Chakrabarti, Nirupam Gupta, Nikhil Chopra
2021 arXiv   pre-print
If the data points are ill-conditioned, the gradient-descent method may require a large number of iterations to converge.  ...  We rigorously show that the resulting pre-conditioned gradient-descent method, with the proposed iterative pre-conditioning, achieves superlinear convergence when the least-squares problem has a unique  ...  Acknowledgements This work is being carried out as a part of the Pipeline System Integrity Management Project, which is supported by the Petroleum Institute, Khalifa University of Science and Technology  ... 
arXiv:2008.02856v2 fatcat:qebraytmqvgkbo5lijvn5it2xe
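The two entries above concern preconditioning gradient descent on ill-conditioned least-squares problems. As an illustration of why preconditioning helps (this is not the papers' iterative, distributed scheme; here the preconditioner is simply a fixed, hypothetical (AᵀA + βI)⁻¹), a minimal NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ill-conditioned least-squares problem: minimize 0.5 * ||A x - b||^2
A = rng.normal(size=(50, 2)) @ np.diag([1.0, 100.0])   # condition number ~100
x_true = np.array([1.0, -2.0])
b = A @ x_true

H = A.T @ A                                  # Hessian of the cost
grad = lambda x: A.T @ (A @ x - b)

def run(x0, precond, eta, iters=200):
    """Preconditioned gradient descent: x <- x - eta * K * grad(x)."""
    x = x0.copy()
    for _ in range(iters):
        x = x - eta * precond @ grad(x)
    return x

x0 = np.zeros(2)
eta_gd = 1.0 / np.linalg.eigvalsh(H).max()   # largest stable step for plain GD
x_gd = run(x0, np.eye(2), eta_gd)            # plain GD: crawls along the flat direction

K = np.linalg.inv(H + 1e-6 * np.eye(2))      # idealized fixed preconditioner
x_pre = run(x0, K, 1.0)                      # preconditioned GD: near-Newton behavior

err_gd = np.linalg.norm(x_gd - x_true)
err_pre = np.linalg.norm(x_pre - x_true)
```

With the preconditioner, the effective Hessian K H is close to the identity, so the conditioning of A no longer throttles the step size; the papers' contribution is constructing such a K iteratively, without inverting AᵀA directly.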

Parameter Prediction Using Machine Learning in Robot-Assisted Finishing Process

Bobby K Pappachan, Tegoeh Tjahjowidodo
2020 International Journal of Mechanical Engineering and Robotics Research  
With an indirect monitoring method that continually monitors parameters such as spindle speed, these occurrences can be minimized.  ...  Herein lies the benefit of an integrated parameter prediction model, which is able to detect deviation from normal operation early, hence enabling the capability of delivering actionable insights in a  ...  The model accuracy rates of the neural network and gradient descent algorithms across 20 iterations are shown in Figure 7.  ... 
doi:10.18178/ijmerr.9.3.435-440 fatcat:zjn2kikdiraipnvcjajotprlcy

Accelerating Deep Neural Network Training with Inconsistent Stochastic Gradient Descent [article]

Linnan Wang, Yi Yang, Martin Renqiang Min, Srimat Chakradhar
2017 arXiv   pre-print
SGD is the most widely adopted method for training CNNs.  ...  To address this issue, we present Inconsistent Stochastic Gradient Descent (ISGD) to dynamically vary training effort according to learning statuses on batches.  ...  We do not initiate Alg. 2 until the first epoch, to build up a reliable limit (line 22, the condition iter > n).  ... 
arXiv:1603.05544v3 fatcat:x4rgg3dpkze5bj5irdkymwujn4
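ISGD's idea of varying training effort per batch according to its learning status can be caricatured on a toy regression problem. The running-mean threshold and the 3-step budget below are assumptions for illustration, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression data split into minibatches
X = rng.normal(size=(256, 4))
w_true = np.array([2.0, -1.0, 0.5, 3.0])
y = X @ w_true + 0.01 * rng.normal(size=256)
batches = np.array_split(np.arange(256), 16)

def batch_loss(w, idx):
    r = X[idx] @ w - y[idx]
    return 0.5 * np.mean(r ** 2)

def batch_grad(w, idx):
    r = X[idx] @ w - y[idx]
    return X[idx].T @ r / len(idx)

w = np.zeros(4)
eta = 0.1
running = None                    # running mean of batch losses ("learning status")
for epoch in range(20):
    for idx in batches:
        loss = batch_loss(w, idx)
        running = loss if running is None else 0.9 * running + 0.1 * loss
        # Under-performing batch: spend extra iterations on it (inconsistent effort)
        steps = 3 if loss > running else 1
        for _ in range(steps):
            w = w - eta * batch_grad(w, idx)

final_err = np.linalg.norm(w - w_true)
```

The point of the sketch is only the control flow: a consistent SGD would take exactly one step per batch, while the inconsistent variant allocates more steps to batches whose loss lags the running average.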

A Jacobian Free Deterministic Method for Solving Inverse Problems [article]

M.H.A. Piro, J.S. Bell, M. Poschmann, A. Prudil, P. Chan
2022 arXiv   pre-print
The foregoing numerical methods are described with respect to the development of the Optima software to solve inverse problems, which are reduced to non-linear least squares problems.  ...  Moreover, a line search algorithm is also described that ensures that the Armijo conditions are satisfied and that convergence is assured, which makes the success of the approach insensitive to the initial  ...  One could interpret the LMA approach as an interpolation between the Gauss-Newton method and the method of gradient descent.  ... 
arXiv:2203.04138v1 fatcat:hyjyryvwlzdlfaxta7trs4chbe
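The last snippet's observation, that the Levenberg-Marquardt algorithm (LMA) interpolates between Gauss-Newton and gradient descent via its damping parameter, can be sketched as follows. The exponential model and the doubling/halving schedule are hypothetical choices for illustration, not Optima's implementation:

```python
import numpy as np

# Nonlinear least squares: fit y = a * exp(b * t) (hypothetical model)
t = np.linspace(0.0, 1.0, 20)
a_true, b_true = 2.0, -1.5
y = a_true * np.exp(b_true * t)

def residual(p):
    a, b = p
    return a * np.exp(b * t) - y

def jacobian(p):
    a, b = p
    e = np.exp(b * t)
    return np.column_stack([e, a * t * e])       # d r / d a, d r / d b

p = np.array([1.0, 0.0])                         # initial guess
lam = 1e-2   # damping: lam -> 0 recovers Gauss-Newton, large lam ~ gradient descent
for _ in range(50):
    r, J = residual(p), jacobian(p)
    # LM step: (J^T J + lam * I) delta = -J^T r
    delta = np.linalg.solve(J.T @ J + lam * np.eye(2), -J.T @ r)
    if np.sum(residual(p + delta) ** 2) < np.sum(r ** 2):
        p, lam = p + delta, lam * 0.5            # accept: trust Gauss-Newton more
    else:
        lam *= 2.0                               # reject: lean toward gradient descent

fit_err = np.linalg.norm(p - np.array([a_true, b_true]))
```

When a step fails to reduce the residual, increasing lam shrinks the step and rotates it toward the steepest-descent direction -Jᵀr, which is exactly the interpolation the snippet describes.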

Feature Imitating Networks [article]

Sari Saba-Sadiya, Tuka Alhanai, Mohammad M Ghassemi
2021 arXiv   pre-print
We conclude that FINs can help bridge the gap between domain experts and machine learning practitioners by enabling researchers to harness insights from feature-engineering to enhance the performance of  ...  In this paper, we introduce a novel approach to neural learning: the Feature-Imitating-Network (FIN).  ...  Training All models (both FIN and Baseline) were trained using early stopping and a simple gradient descent optimizer.  ... 
arXiv:2110.04831v2 fatcat:65t6d7td4jf7pfgc4kc5ebh34e

Deep curriculum learning optimization

Henok Ghebrechristos, Gita Alaghband
2020 SN Computer Science  
In addition to these traits, the framework can combine CL with several variants of gradient descent (GD) algorithms and has been used to generate efficient batch-specific or dataset-specific strategies  ...  method.  ...  SGD produces the same performance as regular gradient descent when the learning rate is low. Another variant of gradient descent widely used in practice is Adam.  ... 
doi:10.1007/s42979-020-00251-7 fatcat:xzs7gigpizaejaezdkeodfrqqu

Transferring Domain Knowledge with an Adviser in Continuous Tasks [article]

Rukshan Wijesinghe, Kasun Vithanage, Dumindu Tissera, Alex Xavier, Subha Fernando, Jayathu Samarawickrama
2021 arXiv   pre-print
Hence, we adapt the Deep Deterministic Policy Gradient (DDPG) algorithm to incorporate an adviser, which allows integrating domain knowledge in the form of pre-learned policies or pre-defined relationships  ...  to enhance the agent's learning process.  ...  The DDPG algorithm updates the policy in each iteration with approximated policy gradients which are derived from the gradients of the critic network's output with respect to the parameters of the actor  ... 
arXiv:2102.08029v1 fatcat:44k3ziahsra2rg6mxuzlatjuzy


Minibatch Processing in Spiking Neural Networks [article]

Daniel J. Saunders, Cooper Sigrist, Kenneth Chaney, Robert Kozma, Hava T. Siegelmann
2019 arXiv   pre-print
Machine learning practitioners and biological modelers alike may benefit from the drastically reduced simulation time and increased iteration speed this method enables.  ...  Code to reproduce the benchmarks and experimental findings in this paper can be found at  ...  Acknowledgements We would like to thank Sam Wenke, Jim Fleming, and Mike Qiu for their careful review of the manuscript.  ... 
arXiv:1909.02549v1 fatcat:hnd5fsycgfgzbjzk45tsddaomi

Mesh-based spherical deconvolution: A flexible approach to reconstruction of non-negative fiber orientation distributions

Vishal Patel, Yonggang Shi, Paul M. Thompson, Arthur W. Toga
2010 NeuroImage  
Recently proposed methods for recovery of fiber orientation via spherical deconvolution utilize a spherical harmonics framework and are susceptible to noise, yielding physically-invalid results even when  ...  We show that the method is robust and reliable by reconstructing known crossing fiber anatomy in multiple subjects.  ...  Acknowledgments The authors graciously acknowledge the support of the National Institutes of Health for the funding support via grants 5T32GM008042-  ... 
doi:10.1016/j.neuroimage.2010.02.060 pmid:20206705 pmcid:PMC2927199 fatcat:7m62rsikynd5nncjd4lczcxyiu

Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework [article]

Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou
2021 arXiv   pre-print
The PDP is distinguished from existing methods by two novel techniques: first, we differentiate through Pontryagin's Maximum Principle, and this allows us to obtain the analytical derivative of a trajectory  ...  can be iteratively solved using standard control tools.  ...  the gradient descent $\theta_{k+1} = \theta_k - \eta_k \left.\tfrac{dL}{d\theta}\right|_{\theta_k}$, with $\left.\tfrac{dL}{d\theta}\right|_{\theta_k} = \left.\tfrac{\partial L}{\partial \xi}\right|_{\xi_{\theta_k}} \left.\tfrac{\partial \xi_\theta}{\partial \theta}\right|_{\theta_k} + \left.\tfrac{\partial L}{\partial \theta}\right|_{\theta_k}$ (9). Here, $k = 0, 1, \cdots$ is the iteration index and $\left.\tfrac{dL}{d\theta}\right|_{\theta_k}$ is the gradient of the loss with respect  ... 
arXiv:1912.12970v5 fatcat:7sse7c3bvzf7tjyu2smubzh3ri
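The chain-rule gradient in Eq. (9), a trajectory term ∂L/∂ξ · ∂ξ_θ/∂θ plus a direct term ∂L/∂θ, can be made concrete with a scalar toy example; the functions ξ(θ) and L below are hypothetical, chosen only to make the two gradient terms explicit:

```python
# Hypothetical scalar setup: trajectory xi(theta) = theta**2,
# loss L(xi, theta) = (xi - 1)**2 + 0.1 * theta**2
def xi(theta):
    return theta ** 2

def total_grad(theta):
    # dL/dtheta = (dL/dxi) * (dxi/dtheta) + dL/dtheta|_partial, as in Eq. (9)
    dL_dxi = 2.0 * (xi(theta) - 1.0)      # loss gradient through the trajectory
    dxi_dtheta = 2.0 * theta              # trajectory sensitivity to the parameter
    dL_dtheta_partial = 0.2 * theta       # direct dependence of the loss on theta
    return dL_dxi * dxi_dtheta + dL_dtheta_partial

# Gradient descent theta_{k+1} = theta_k - eta * dL/dtheta
theta = 2.0
eta = 0.05
for _ in range(200):
    theta -= eta * total_grad(theta)
```

In the PDP, the trajectory sensitivity ∂ξ_θ/∂θ is the expensive piece, obtained analytically by differentiating through Pontryagin's Maximum Principle rather than by hand as here.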

A general framework for nonlinear multigrid inversion

Seungseok Oh, A.B. Milstein, C.A. Bouman, K.J. Webb
2005 IEEE Transactions on Image Processing  
An application of our method to Bayesian optical diffusion tomography with a generalized Gaussian Markov random-field image prior model shows the potential for very large computational savings.  ...  The method works by dynamically adjusting the cost functionals at different scales so that they are consistent with, and ultimately reduce, the finest scale cost functional.  ...  Hence, common methods, such as pre-conditioned conjugate gradient and/or adjoint differentiation [63] , [64] can be employed at each grid resolution.  ... 
doi:10.1109/tip.2004.837555 pmid:15646877 fatcat:ank5crre4bcm5lba4pf6zdrwgm

Fast Discrete Distribution Clustering Using Wasserstein Barycenter with Sparse Support [article]

Jianbo Ye, Panruo Wu, James Z. Wang, Jia Li
2017 arXiv   pre-print
D2-clustering pursues the minimum total within-cluster variation for a set of discrete distributions subject to the Kantorovich-Wasserstein metric.  ...  D2-clustering has a severe scalability issue, the bottleneck being the computation of a centroid distribution, called Wasserstein barycenter, that minimizes its sum of squared distances to the cluster  ...  We also thank the reviewers and the associate editor for constructive comments and suggestions.  ... 
arXiv:1510.00012v4 fatcat:fxe6akdscnhxtkvwu2637mngcq

PANFIS++: A Generalized Approach to Evolving Learning [article]

Mahardhika Pratama
2017 arXiv   pre-print
This module allows active selection of data streams for the training process, thereby expediting execution time and enhancing generalization performance; 2) PANFIS++ is built upon an interval type-2 fuzzy  ...  This is meant to tackle temporal system dynamics.  ...  The ZEDM is used to adjust the q-design factor of PANFIS++ and constitutes a generalized version of the gradient descent method [35].  ... 
arXiv:1705.02476v1 fatcat:wkwjlafrkfdsvlf62rslqviqqe

Distributed Machine Learning for Wireless Communication Networks: Techniques, Architectures, and Applications [article]

S. Hu, X. Chen, W. Ni, E. Hossain, X. Wang
2020 arXiv   pre-print
There is a clear gap in the existing literature in that the DML techniques are yet to be systematically reviewed for their applicability to wireless systems.  ...  We also discuss the potential adversarial attacks faced by DML applications, and describe state-of-the-art countermeasures to preserve privacy and security.  ...  The existing studies on FL solely leverage the first-order gradient descent (GD), without taking into account past iterations for the gradient renewal, which can potentially speed up convergence.  ... 
arXiv:2012.01489v1 fatcat:pdauhq4xbbepvf26clhpqnc2ci
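The survey's closing remark, that first-order GD ignores past iterations whose reuse could speed up convergence, is the classical momentum (heavy-ball) idea. A sketch on an ill-conditioned quadratic, with step size and momentum coefficient chosen by hand for illustration:

```python
import numpy as np

# Ill-conditioned quadratic: f(x) = 0.5 * x^T diag(1, 100) x
h = np.array([1.0, 100.0])
grad = lambda x: h * x
x0 = np.array([1.0, 1.0])

def plain_gd(iters=100, eta=1.0 / 100):
    x = x0.copy()
    for _ in range(iters):
        x = x - eta * grad(x)           # uses only the current gradient
    return x

def heavy_ball(iters=100, eta=1.0 / 100, beta=0.8):
    x, v = x0.copy(), np.zeros(2)
    for _ in range(iters):
        v = beta * v - eta * grad(x)    # accumulates past gradients in the velocity
        x = x + v
    return x

err_gd = np.linalg.norm(plain_gd())
err_hb = np.linalg.norm(heavy_ball())
```

Along the flat eigendirection, plain GD contracts by only (1 - 1/100) per step, while the velocity term lets momentum take effectively larger steps there without destabilizing the steep direction.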
Showing results 1 — 15 out of 828 results