Iterative Pre-Conditioning to Expedite the Gradient-Descent Method
[article]
2020
arXiv
pre-print
In this paper, we propose an iterative pre-conditioning approach that can significantly attenuate the influence of the problem's conditioning on the convergence-speed of the gradient-descent method. ...
Specifically, the method requires a large number of iterations to converge to a solution if the optimization problem is ill-conditioned. ...
ACKNOWLEDGEMENTS This work is being carried out as a part of the Pipeline System Integrity Management Project, which is supported by the Petroleum Institute, Khalifa University of Science and Technology ...
arXiv:2003.07180v2
fatcat:kgt46uwwujcghmrz7osfanxwgm
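The snippets in the entry above note that gradient descent needs a large number of iterations when the problem is ill-conditioned. The bound below is the standard textbook rate for gradient descent on a strongly convex quadratic, included here only to make the dependence on the condition number explicit; it is not taken from the paper.

% Gradient descent on f(x) = (1/2) x^T A x - b^T x with A positive definite,
% using the constant step size eta = 2 / (lambda_min(A) + lambda_max(A)):
\[
  \|x_k - x^\ast\| \;\le\; \left(\frac{\kappa - 1}{\kappa + 1}\right)^{k} \|x_0 - x^\ast\|,
  \qquad \kappa = \frac{\lambda_{\max}(A)}{\lambda_{\min}(A)},
\]
% so reaching accuracy epsilon takes on the order of kappa * log(1/epsilon) iterations,
% which grows without bound as the problem becomes more ill-conditioned.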
Iterative Pre-Conditioning for Expediting the Gradient-Descent Method: The Distributed Linear Least-Squares Problem
[article]
2021
arXiv
pre-print
If the data points are ill-conditioned, the gradient-descent method may require a large number of iterations to converge. ...
We rigorously show that the resulting pre-conditioned gradient-descent method, with the proposed iterative pre-conditioning, achieves superlinear convergence when the least-squares problem has a unique ...
Acknowledgements This work is being carried out as a part of the Pipeline System Integrity Management Project, which is supported by the Petroleum Institute, Khalifa University of Science and Technology ...
arXiv:2008.02856v2
fatcat:qebraytmqvgkbo5lijvn5it2xe
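For the two iterative pre-conditioning entries above, here is a minimal sketch of the general idea: gradient descent on a linear least-squares objective in which the gradient is multiplied by a pre-conditioner matrix that is itself refined at every iteration. The specific refinement rule below (a gradient step toward the inverse Hessian), the step sizes, and the example data are illustrative assumptions, not the algorithm published in these papers.

import numpy as np

def iteratively_preconditioned_gd(A, b, num_iters=500):
    """Solve min_x 0.5 * ||A x - b||^2 with a pre-conditioned gradient method.
    The pre-conditioner K is refined each iteration toward (A^T A)^{-1};
    this refinement rule is an illustrative assumption."""
    m, n = A.shape
    H = A.T @ A                               # Hessian of the least-squares objective
    h_norm = np.linalg.norm(H, 2)
    alpha = 1.0 / h_norm ** 2                 # step size for the K-update (keeps it stable)
    x = np.zeros(n)
    K = np.eye(n) / h_norm                    # initial pre-conditioner estimate
    for _ in range(num_iters):
        # Refine K by one gradient step on 0.5 * ||H K - I||_F^2.
        K = K - alpha * H @ (H @ K - np.eye(n))
        grad = A.T @ (A @ x - b)              # gradient of the objective
        x = x - K @ grad                      # pre-conditioned descent step
    return x

# Example usage (assumed data, for illustration only):
# rng = np.random.default_rng(0)
# A = rng.normal(size=(50, 5)); b = A @ np.ones(5)
# print(iteratively_preconditioned_gd(A, b))  # approaches the solution np.ones(5)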
Parameter Prediction Using Machine Learning in Robot-Assisted Finishing Process
2020
International Journal of Mechanical Engineering and Robotics Research
With an indirect monitoring method that continually monitors parameters such as spindle speed, these occurrences can be minimized. ...
Herein lies the benefit of an integrated parameter prediction model, which is able to detect deviation from normal operation early, hence enabling the delivery of actionable insights in a ...
The model accuracy rates of neural network and gradient descent algorithms across 20 iterations are shown in figure 7. ...
doi:10.18178/ijmerr.9.3.435-440
fatcat:zjn2kikdiraipnvcjajotprlcy
Accelerating Deep Neural Network Training with Inconsistent Stochastic Gradient Descent
[article]
2017
arXiv
pre-print
SGD is the widely adopted method to train CNN. ...
To address this issue, we present Inconsistent Stochastic Gradient Descent (ISGD) to dynamically vary training effort according to learning statuses on batches. ...
We do not initiate Alg. 2 until the first epoch has completed, so as to build up a reliable limit (the condition iter > n on line 22). ...
arXiv:1603.05544v3
fatcat:x4rgg3dpkze5bj5irdkymwujn4
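A rough sketch of the idea in the entry above: a stochastic-gradient loop that spends extra update steps on batches whose loss is still unusually high, i.e. training effort varies with the learning status of each batch. The trigger used here (loss above the running mean plus one standard deviation) and the cap on extra steps are assumptions for illustration, not the rule defined in the paper.

import numpy as np

def inconsistent_sgd(params, batches, loss_fn, grad_fn, lr=0.01, max_extra_steps=5):
    """One pass over the batches with batch-dependent training effort."""
    losses = []                                         # running record of observed batch losses
    for batch in batches:
        losses.append(loss_fn(params, batch))
        params = params - lr * grad_fn(params, batch)   # baseline single update
        threshold = np.mean(losses) + np.std(losses)    # "under-trained" criterion (assumed)
        extra = 0
        while loss_fn(params, batch) > threshold and extra < max_extra_steps:
            params = params - lr * grad_fn(params, batch)   # additional effort on this batch
            extra += 1
    return params

# Example usage on a tiny least-squares problem (assumed data):
# rng = np.random.default_rng(0)
# X, y = rng.normal(size=(256, 3)), rng.normal(size=256)
# batches = [(X[i:i + 32], y[i:i + 32]) for i in range(0, 256, 32)]
# loss = lambda w, b: np.mean((b[0] @ w - b[1]) ** 2)
# grad = lambda w, b: 2 * b[0].T @ (b[0] @ w - b[1]) / len(b[1])
# w = inconsistent_sgd(np.zeros(3), batches, loss, grad)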
A Jacobian Free Deterministic Method for Solving Inverse Problems
[article]
2022
arXiv
pre-print
The foregoing numerical methods are described with respect to the development of the Optima software to solve inverse problems, which are reduced to non-linear least squares problems. ...
Moreover, a line search algorithm is also described that ensures that the Armijo conditions are satisfied and that convergence is assured, which makes the success of the approach insensitive to the initial ...
One could interpret the LMA approach as an interpolation between the Gauss-Newton method and the method of gradient descent. ...
arXiv:2203.04138v1
fatcat:hyjyryvwlzdlfaxta7trs4chbe
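The last snippet of this entry describes the LMA as an interpolation between the Gauss-Newton method and gradient descent. The sketch below shows the standard, Jacobian-based Levenberg-Marquardt step to illustrate that interpolation; it is not the Jacobian-free scheme the entry itself is about, and the exponential-fit example is assumed.

import numpy as np

def levenberg_marquardt_step(r, J, x, lam):
    """One LM step: solve (J^T J + lam*I) dx = -J^T r.
    lam -> 0 recovers the Gauss-Newton step; large lam approaches a small
    gradient-descent step dx ~ -(1/lam) * J^T r."""
    residual = r(x)
    jac = J(x)
    lhs = jac.T @ jac + lam * np.eye(x.size)
    rhs = -jac.T @ residual
    return x + np.linalg.solve(lhs, rhs)

# Example (assumed): fit y = exp(a*t) by nonlinear least squares, single parameter a.
# t = np.linspace(0, 1, 20); y = np.exp(0.7 * t)
# r = lambda x: np.exp(x[0] * t) - y
# J = lambda x: (t * np.exp(x[0] * t)).reshape(-1, 1)
# x = np.array([0.0])
# for _ in range(20):
#     x = levenberg_marquardt_step(r, J, x, lam=1e-2)
# # x[0] approaches 0.7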
Feature Imitating Networks
[article]
2021
arXiv
pre-print
We conclude that FINs can help bridge the gap between domain experts and machine learning practitioners by enabling researchers to harness insights from feature-engineering to enhance the performance of ...
In this paper, we introduce a novel approach to neural learning: the Feature-Imitating-Network (FIN). ...
Training All models (both FIN and Baseline) were trained using early stopping and a simple gradient descent optimizer. ...
arXiv:2110.04831v2
fatcat:65t6d7td4jf7pfgc4kc5ebh34e
Deep curriculum learning optimization
2020
SN Computer Science
In addition to these traits, the framework can combine CL with several variants of gradient descent (GD) algorithms and has been used to generate efficient batch-specific or data-set specific strategies ...
SGD produces the same performance as regular gradient descent when the learning rate is low. Another variant of gradient descent widely used in practice is Adam. ...
doi:10.1007/s42979-020-00251-7
fatcat:xzs7gigpizaejaezdkeodfrqqu
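The last snippet of the entry above contrasts plain (stochastic) gradient descent with Adam. Below is a minimal side-by-side sketch of the two update rules, using the commonly published Adam defaults; nothing in it is specific to the curriculum-learning framework the entry describes.

import numpy as np

def sgd_step(w, grad, lr=0.01):
    # Plain gradient-descent step on the current (mini-batch) gradient.
    return w - lr * grad

def adam_step(w, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # state = (m, v, t): first/second moment estimates and the step counter.
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad            # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2       # biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, (m, v, t)

# Usage: initialize state = (np.zeros_like(w), np.zeros_like(w), 0), then call
# adam_step once per mini-batch gradient.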
Transferring Domain Knowledge with an Adviser in Continuous Tasks
[article]
2021
arXiv
pre-print
Hence, we adapt the Deep Deterministic Policy Gradient (DDPG) algorithm to incorporate an adviser, which allows integrating domain knowledge in the form of pre-learned policies or pre-defined relationships to enhance the agent's learning process. ...
The DDPG algorithm updates the policy in each iteration with approximated policy gradients which are derived from the gradients of the critic network's output with respect to the parameters of the actor ...
arXiv:2102.08029v1
fatcat:44k3ziahsra2rg6mxuzlatjuzy
Minibatch Processing in Spiking Neural Networks
[article]
2019
arXiv
pre-print
Machine learning practitioners and biological modelers alike may benefit from the drastically reduced simulation time and increased iteration speed this method enables. ...
Code to reproduce the benchmarks and experimental findings in this paper can be found at https://github.com/djsaunde/snn-minibatch. ...
Acknowledgements We would like to thank Sam Wenke, Jim Fleming, and Mike Qiu for their careful review of the manuscript. ...
arXiv:1909.02549v1
fatcat:hnd5fsycgfgzbjzk45tsddaomi
Mesh-based spherical deconvolution: A flexible approach to reconstruction of non-negative fiber orientation distributions
2010
NeuroImage
Recently proposed methods for recovery of fiber orientation via spherical deconvolution utilize a spherical harmonics framework and are susceptible to noise, yielding physically-invalid results even when ...
We show that the method is robust and reliable by reconstructing known crossing fiber anatomy in multiple subjects. ...
Acknowledgments The authors graciously acknowledge the support of the National Institutes of Health for the funding support via grants 5T32GM008042- ...
doi:10.1016/j.neuroimage.2010.02.060
pmid:20206705
pmcid:PMC2927199
fatcat:7m62rsikynd5nncjd4lczcxyiu
Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework
[article]
2021
arXiv
pre-print
The PDP is distinguished from existing methods by two novel techniques: first, we differentiate through Pontryagin's Maximum Principle, which allows us to obtain the analytical derivative of a trajectory ...
can be iteratively solved using standard control tools. ...
the gradient descent $\theta_{k+1} = \theta_k - \eta_k \left.\tfrac{dL}{d\theta}\right|_{\theta_k}$ with $\left.\tfrac{dL}{d\theta}\right|_{\theta_k} = \left.\tfrac{\partial L}{\partial \xi}\right|_{\xi_{\theta_k}} \left.\tfrac{\partial \xi_\theta}{\partial \theta}\right|_{\theta_k} + \left.\tfrac{\partial L}{\partial \theta}\right|_{\theta_k}$ (9). Here, $k = 0, 1, \dots$ is the iteration index; $\left.\tfrac{dL}{d\theta}\right|_{\theta_k}$ is the gradient of the loss with respect ...
arXiv:1912.12970v5
fatcat:7sse7c3bvzf7tjyu2smubzh3ri
A general framework for nonlinear multigrid inversion
2005
IEEE Transactions on Image Processing
An application of our method to Bayesian optical diffusion tomography with a generalized Gaussian Markov random-field image prior model shows the potential for very large computational savings. ...
The method works by dynamically adjusting the cost functionals at different scales so that they are consistent with, and ultimately reduce, the finest scale cost functional. ...
Hence, common methods, such as pre-conditioned conjugate gradient and/or adjoint differentiation [63] , [64] can be employed at each grid resolution. ...
doi:10.1109/tip.2004.837555
pmid:15646877
fatcat:ank5crre4bcm5lba4pf6zdrwgm
Fast Discrete Distribution Clustering Using Wasserstein Barycenter with Sparse Support
[article]
2017
arXiv
pre-print
D2-clustering pursues the minimum total within-cluster variation for a set of discrete distributions subject to the Kantorovich-Wasserstein metric. ...
D2-clustering has a severe scalability issue, the bottleneck being the computation of a centroid distribution, called Wasserstein barycenter, that minimizes its sum of squared distances to the cluster ...
We also thank the reviewers and the associate editor for constructive comments and suggestions. ...
arXiv:1510.00012v4
fatcat:fxe6akdscnhxtkvwu2637mngcq
PANFIS++: A Generalized Approach to Evolving Learning
[article]
2017
arXiv
pre-print
This module allows data streams to be actively selected for the training process, thereby expediting execution time and enhancing generalization performance; 2) PANFIS++ is built upon an interval type-2 fuzzy ...
This is meant to tackle the temporal system dynamic. ...
The ZEDM is used to adjust the q-design factor of PANFIS++ and constitutes a generalized version of the gradient descent method [35] . ...
arXiv:1705.02476v1
fatcat:wkwjlafrkfdsvlf62rslqviqqe
Distributed Machine Learning for Wireless Communication Networks: Techniques, Architectures, and Applications
[article]
2020
arXiv
pre-print
There is a clear gap in the existing literature in that the DML techniques are yet to be systematically reviewed for their applicability to wireless systems. ...
We also discuss the potential adversarial attacks faced by DML applications, and describe state-of-the-art countermeasures to preserve privacy and security. ...
Existing studies on FL rely solely on first-order gradient descent (GD), without taking past iterations into account when renewing the gradient, even though doing so could potentially speed up convergence. ...
arXiv:2012.01489v1
fatcat:pdauhq4xbbepvf26clhpqnc2ci
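The last snippet of this entry points out that existing FL work uses plain first-order gradient descent and does not reuse past iterations, even though doing so could speed up convergence. One standard way to fold gradient history into the update is momentum (the heavy-ball method); the sketch below is a generic illustration of that idea, not a scheme proposed in the surveyed paper.

import numpy as np

def gd_with_momentum(grad_fn, w, lr=0.01, momentum=0.9, num_iters=100):
    velocity = np.zeros_like(w)          # exponentially decayed accumulation of past gradients
    for _ in range(num_iters):
        velocity = momentum * velocity + grad_fn(w)
        w = w - lr * velocity            # the step uses both current and past gradients
    return w

# Example (assumed): minimize f(w) = 0.5 * ||w - c||^2 for a fixed target c.
# c = np.array([1.0, -2.0, 3.0])
# w = gd_with_momentum(lambda w: w - c, np.zeros(3))   # approaches c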
Showing results 1 — 15 out of 828 results