Byzantine Fault Tolerance in Distributed Machine Learning: A Survey
[article]
2022
arXiv
pre-print
In this paper, we present a survey of recent works surrounding BFT in DML, mainly in first-order optimization methods, especially Stochastic Gradient Descent (SGD). ...
Byzantine failures remain difficult to tackle due to their unrestricted nature and, as a result, the possibility of generating arbitrary data. ...
Advantages of this framework include its scalability, flexibility, and nearly linear complexity; when combined with any previous robust aggregation rule, DETOX improves its efficiency and robustness ...
arXiv:2205.02572v1
fatcat:h2hkcgz3w5cvrnro6whl2rpvby
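The robust aggregation rules surveyed in the entry above can be illustrated with a minimal sketch. The example below implements coordinate-wise median aggregation, a classic Byzantine-robust rule rather than DETOX itself, and assumes each worker reports a flat numpy gradient vector.

```python
import numpy as np

def coordinatewise_median(worker_gradients):
    """Aggregate worker gradients by taking the median of each coordinate.

    A classic Byzantine-robust alternative to plain averaging: as long as
    fewer than half of the workers send arbitrary (Byzantine) gradients,
    each aggregated coordinate stays within the range of honest values.
    """
    stacked = np.stack(worker_gradients)          # shape: (n_workers, n_params)
    return np.median(stacked, axis=0)

# Toy example: 4 honest workers and 1 Byzantine worker sending garbage.
honest = [np.array([0.9, 1.1]), np.array([1.0, 1.0]),
          np.array([1.1, 0.9]), np.array([1.0, 1.05])]
byzantine = [np.array([1e6, -1e6])]
print(coordinatewise_median(honest + byzantine))  # stays close to [1.0, 1.0]
```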
SPF-GMKL
2012
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '12
However, the projected gradient descent GMKL optimizer is inefficient as the computation of the step size and a reasonably accurate objective function value or gradient direction are all expensive. ...
The lack of an efficient, general purpose optimizer capable of handling a wide range of formulations presents a significant challenge to those looking to take MKL out of the lab and into the real world ...
Acknowledgements We are grateful to Kamal Gupta and to the Computer Services Center at IIT Delhi. ...
doi:10.1145/2339530.2339648
dblp:conf/kdd/JainVV12
fatcat:wyo5acexjrdubnrmbunckhpjzq
Principal whitened gradient for information geometry
2008
Neural Networks
The optimization based on the principal whitened gradients demonstrates faster and more robust convergence in simulations on unsupervised learning with synthetic data and on discriminant analysis of breast ...
Second, removal of the minor components of gradients enhances the estimation of the Fisher information matrix and reduces the computational cost. ...
Acknowledgments This work is supported by the Academy of Finland in the projects Neural methods in information retrieval based on automatic content analysis and relevance feedback and Finnish Centre of ...
doi:10.1016/j.neunet.2007.12.016
pmid:18255260
fatcat:amsf3qn2yvhprfx4cp6uybvvma
Universal Adversarial Training
2020
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference
robust models with only 2× the cost of natural training. ...
We study the efficient generation of universal adversarial perturbations, and also efficient methods for hardening networks to these attacks. ...
The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ...
doi:10.1609/aaai.v34i04.6017
fatcat:ybmlkpc66jaepkj5q2bujk3vti
Robust Trust Region for Weakly Supervised Segmentation
[article]
2021
arXiv
pre-print
We propose a new robust trust region approach for regularized losses improving the state-of-the-art results. Our approach can be seen as a higher-order generalization of the classic chain rule. ...
However, many common priors require optimization stronger than gradient descent. Thus, such regularizers have limited applicability in deep learning. ...
We also thank Vladimir Kolmogorov for suggesting prior studies of the tightness of the Potts model relaxations. [Figure: qualitative segmentation comparison of PCE-GD, Grid-GD, Dense-GD, and Grid-TR against the input images and ground truth.] ...
arXiv:2104.01948v2
fatcat:p7z3e6hd75ds7kqqb2tmoxqyhm
Convolutional Neural Network Based Multimodal Biometric Human Authentication using Face, Palm Veins and Fingerprint
2020
International Journal of Innovative Technology and Exploring Engineering (IJITEE), Volume 8, Issue 10, August 2019, Regular Issue
In the proposed system, a multi-layer Convolutional Neural Network (CNN) is applied to multimodal biometric human authentication using face, palm vein, and fingerprints to increase the robustness of the system. ...
The performance of the system is evaluated on the basis of percentage recognition accuracy, and it shows significant improvement over the unimodal biometric system and existing multimodal systems. ...
The robustness of stochastic gradient descent and the effectiveness of batch gradient descent are combined in the mini-batch gradient method. ...
doi:10.35940/ijitee.c8467.019320
fatcat:myivrdtwavbyjeyqlkkssofcwa
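The last snippet of the entry above describes mini-batch gradient descent as a compromise between stochastic and batch gradient descent. A minimal sketch, assuming a least-squares objective and numpy arrays `X`, `y` (both illustrative, not from the paper):

```python
import numpy as np

def minibatch_gd(X, y, lr=0.1, batch_size=32, epochs=20, seed=0):
    """Mini-batch gradient descent on a least-squares objective.

    Each step uses the average gradient over a small random batch,
    trading off the noise of single-sample SGD against the cost of
    full-batch gradient descent.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    n = X.shape[0]
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), max(1, n // batch_size)):
            Xb, yb = X[idx], y[idx]
            grad = Xb.T @ (Xb @ w - yb) / len(idx)   # gradient of 0.5*||Xb w - yb||^2 per batch
            w -= lr * grad
    return w
```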
TAdam: A Robust Stochastic Gradient Optimizer
[article]
2020
arXiv
pre-print
We therefore propose a new stochastic gradient optimization method whose robustness is directly built into the algorithm, using the robust Student's t-distribution as its core idea. ...
Adam, the popular optimization method, is modified with our method, and the resulting optimizer, called TAdam, is shown to effectively outperform Adam in terms of robustness against noise on diverse ...
INTRODUCTION: The field of machine learning is undoubtedly dominated by first-order optimization methods based on the gradient descent algorithm [1] and, particularly, its stochastic variant, the stochastic ...
arXiv:2003.00179v2
fatcat:h632ernfsra5pd4pajr2tfwtxi
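For context on the entry above, here is a minimal sketch of the standard Adam update that TAdam builds on. Per the abstract, TAdam replaces the exponential-moving-average first moment with a Student's t-based robust estimate; the exact rule is in the paper, so this shows only the unmodified baseline.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of the standard Adam update.

    TAdam (per the entry above) keeps this overall structure but replaces
    the exponential-moving-average first moment with a Student's t-based
    robust estimate, so isolated outlier gradients perturb m far less.
    """
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```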
Adaptive shot allocation for fast convergence in variational quantum algorithms
[article]
2021
arXiv
pre-print
Here we present a new stochastic gradient descent method using an adaptive number of shots at each step, called the global Coupled Adaptive Number of Shots (gCANS) method, which improves on prior art in ...
both the number of iterations and the number of shots required. ...
Lipschitz-Continuous Gradients: An appropriate choice of the learning rate, α, is crucial for a well-behaved gradient descent algorithm. ...
arXiv:2108.10434v1
fatcat:wpuxujlef5b2zkd75rmwxbyksu
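The snippet above notes that the learning rate α is central to well-behaved gradient descent. As a general reminder (not specific to the gCANS paper), the classical descent lemma for an L-Lipschitz gradient gives the standard sufficient condition:

```latex
% Classical descent lemma, assuming \nabla f is L-Lipschitz:
% with step size 0 < \alpha \le 1/L, a gradient step cannot increase f.
\[
f(\theta_{t+1}) \;=\; f\!\left(\theta_t - \alpha \nabla f(\theta_t)\right)
\;\le\; f(\theta_t) - \alpha\left(1 - \tfrac{L\alpha}{2}\right)\lVert \nabla f(\theta_t) \rVert^2 ,
\qquad 0 < \alpha \le \tfrac{1}{L}.
\]
```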
Training Efficiency and Robustness in Deep Learning
[article]
2021
arXiv
pre-print
In this thesis, we study approaches to improve the training efficiency and robustness of deep learning models. ...
We show that a redundancy-aware modification to the sampling of training data improves the training speed, and we develop an efficient method for detecting the diversity of the training signal, namely, gradient ...
the game of Go [185], and natural language translation [157]. ...
arXiv:2112.01423v1
fatcat:3yqco7htnjdbng4hx2ilkrnkaq
A Neural Network Based on Synchronized Pairs of Nano-Oscillators
[article]
2017
arXiv
pre-print
With the end of CMOS scaling and increasing demand for efficient neural networks, alternative architectures implementing neural functions efficiently are being studied. ...
These results open the way for the design of alternative architectures adapted to efficient neural network execution. ...
nano-oscillators [13], an exponential-decay peak detector, and the standard gradient descent learning algorithm. ...
arXiv:1709.02274v1
fatcat:4u5bi2ntdfgwfmb4knsiua4c6e
Universal Adversarial Training
[article]
2019
arXiv
pre-print
only 2× the cost of natural training. ...
We study the efficient generation of universal adversarial perturbations, and also efficient methods for hardening networks to these attacks. ...
Projected Gradient Descent (PGD) iteratively applies FGSM multiple times, and is one of the strongest per-instance attacks (Athalye, Carlini, and Wagner, 2018). ...
arXiv:1811.11304v2
fatcat:okcbq7w2lfcnvkmxxgpxadz3gy
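The snippet above describes PGD as iterated FGSM. A minimal numpy sketch of an L-infinity PGD attack, assuming a hypothetical `grad_loss(x, y)` callable that returns the loss gradient with respect to the input; this is a per-instance attack only, not the universal perturbation studied in the paper:

```python
import numpy as np

def pgd_attack(x, y, grad_loss, eps=0.03, alpha=0.01, steps=10):
    """Projected Gradient Descent attack under an L-infinity budget.

    Repeatedly takes FGSM-style signed-gradient steps and projects the
    perturbed input back into the eps-ball around the clean input x.
    `grad_loss(x_adv, y)` is assumed to return dLoss/dInput.
    """
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_loss(x_adv, y)
        x_adv = x_adv + alpha * np.sign(g)          # FGSM step
        x_adv = np.clip(x_adv, x - eps, x + eps)    # project back into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)            # keep a valid pixel range
    return x_adv
```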
Approximated Geodesic Updates with Principal Natural Gradients
2007
Neural Networks (IJCNN), International Joint Conference on
The proposed method demonstrates faster and more robust convergence in the simulations on recovering a Gaussian mixture of artificial data and on discriminative learning of ionosphere data. ...
Second, we prove that dimensionality reduction of natural gradients is necessary for learning multidimensional linear transformations. ...
Calculating the learning direction with only principal components of the natural gradients thus enhances both efficiency and robustness. ...
doi:10.1109/ijcnn.2007.4371149
dblp:conf/ijcnn/YangL07
fatcat:r6bbfwsz7zcixac5fzzefm3kxu
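The entry above computes the learning direction from only the principal components of the natural gradient. A hedged sketch of that idea, assuming a precomputed Fisher estimate `fisher`; the names and procedure are illustrative, not the authors' exact method:

```python
import numpy as np

def principal_natural_gradient(grad, fisher, k):
    """Natural-gradient direction restricted to the top-k Fisher eigenspace.

    Keeping only the principal components avoids amplifying noise along
    near-singular directions of the Fisher estimate and reduces the cost
    of the inverse.
    """
    eigval, eigvec = np.linalg.eigh(fisher)          # ascending eigenvalues
    top_vec = eigvec[:, -k:]                         # top-k eigenvectors
    top_val = np.maximum(eigval[-k:], 1e-12)         # guard against tiny eigenvalues
    coeffs = top_vec.T @ grad                        # coordinates in the top subspace
    return top_vec @ (coeffs / top_val)              # F^{-1} grad within that subspace
```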
Online Learning Rate Adaptation with Hypergradient Descent
[article]
2018
arXiv
pre-print
Our method works by dynamically updating the learning rate during optimization using the gradient with respect to the learning rate of the update rule itself. ...
We demonstrate the effectiveness of the method in a range of optimization problems by applying it to stochastic gradient descent, stochastic gradient descent with Nesterov momentum, and Adam, showing that ...
INTRODUCTION: In nearly all gradient descent algorithms, the choice of learning rate remains central to efficiency; Bengio (2012) asserts that it is "often the single most important hyper-parameter" and ...
arXiv:1703.04782v3
fatcat:bh7nxmrxfnfdrag4fn2pxlapfa
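The first snippet of the entry above states the mechanism plainly: the learning rate is updated with the gradient of the update rule with respect to the learning rate. A minimal sketch of the SGD variant, assuming a user-supplied `grad_fn`; this is an illustrative reconstruction, see the paper for the exact rule:

```python
import numpy as np

def sgd_hd(theta, grad_fn, alpha=0.01, beta=1e-4, steps=100):
    """SGD with a hypergradient update on the learning rate (sketch).

    The hypergradient of the loss with respect to alpha reduces to the dot
    product of the current and previous gradients, so alpha grows while
    consecutive gradients agree and shrinks when they point against each
    other.
    """
    prev_grad = np.zeros_like(theta)
    for _ in range(steps):
        g = grad_fn(theta)
        alpha = alpha + beta * float(g @ prev_grad)  # hypergradient step on alpha
        theta = theta - alpha * g                    # ordinary SGD step on theta
        prev_grad = g
    return theta, alpha
```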
Robust Estimation of Natural Gradient in Optimization by Regularized Linear Regression
[chapter]
2013
Lecture Notes in Computer Science
The correspondence between the estimation of the natural gradient and solving a linear regression problem leads to the definition of regularized versions of the natural gradient. ...
We propose a robust estimation of the natural gradient for the exponential family based on regularized least squares. ...
of the natural gradient in steepest descent blackbox model-based search. ...
doi:10.1007/978-3-642-40020-9_97
fatcat:xqg2dsmjcjaltpimx4i5h46xva
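The entry above casts natural-gradient estimation as a regularized linear regression problem. A minimal sketch of the ridge-regularized view, assuming a noisy Fisher estimate `fisher_hat`; illustrative, not the authors' exact estimator:

```python
import numpy as np

def regularized_natural_gradient(grad, fisher_hat, lam=1e-3):
    """Natural-gradient estimate via a ridge-regularized linear solve.

    Solving (F_hat + lam * I) d = grad instead of inverting a noisy Fisher
    estimate directly stabilizes the direction when F_hat is ill-conditioned.
    """
    dim = fisher_hat.shape[0]
    return np.linalg.solve(fisher_hat + lam * np.eye(dim), grad)
```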
SpikePropamine: Differentiable Plasticity in Spiking Neural Networks
2021
Frontiers in Neurorobotics
gradient descent. ...
Here, we introduce a framework for simultaneously learning the underlying fixed-weights and the rules governing the dynamics of synaptic plasticity and neuromodulated synaptic plasticity in SNNs through ...
Here, the local variable η^(l) acts as a free parameter and is learned through gradient descent. ...
doi:10.3389/fnbot.2021.629210
pmid:34630063
pmcid:PMC8493296
fatcat:5njutoheojau3idveemwyfnvoe
Showing results 1 — 15 out of 21,502 results