21,502 Hits in 5.2 sec

Byzantine Fault Tolerance in Distributed Machine Learning: a Survey [article]

Djamila Bouhata, Hamouma Moumen
2022 arXiv   pre-print
In this paper, we present a survey of recent works surrounding BFT in DML, mainly in first-order optimization methods, especially Stochastic Gradient Descent (SGD).  ...  Byzantine failures are still difficult to tackle due to their unrestricted nature and, as a result, the possibility of generating arbitrary data.  ...  Among the advantages of this framework are its scalability, flexibility, and nearly linear complexity; when combined with any previous robust aggregation rule, DETOX improves that rule's efficiency and robustness (a robust-aggregation sketch follows this entry)  ... 
arXiv:2205.02572v1 fatcat:h2hkcgz3w5cvrnro6whl2rpvby
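The survey above concerns making distributed SGD resilient to workers that send arbitrary gradients. As a minimal, generic illustration of the robust aggregation rules it refers to (not DETOX or any specific rule from the survey), a coordinate-wise median can replace the usual mean on the parameter server:

```python
import numpy as np

def coordinate_wise_median(worker_grads):
    """Aggregate worker gradients by taking the median of every coordinate.

    worker_grads: array of shape (num_workers, num_params).
    Unlike the mean, the per-coordinate median cannot be dragged arbitrarily
    far by a minority of Byzantine workers sending adversarial gradients.
    """
    return np.median(np.asarray(worker_grads), axis=0)

# Hypothetical parameter-server step:
#   g = coordinate_wise_median(collected_grads)
#   theta -= lr * g
```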

SPG-GMKL

Ashesh Jain, S.V.N. Vishwanathan, Manik Varma
2012 Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '12  
However, the projected gradient descent GMKL optimizer is inefficient, as computing the step size, a reasonably accurate objective function value, and the gradient direction are all expensive (a generic projected-gradient sketch follows this entry).  ...  The lack of an efficient, general-purpose optimizer capable of handling a wide range of formulations presents a significant challenge to those looking to take MKL out of the lab and into the real world  ...  Acknowledgements We are grateful to Kamal Gupta and to the Computer Services Center at IIT Delhi.  ... 
doi:10.1145/2339530.2339648 dblp:conf/kdd/JainVV12 fatcat:wyo5acexjrdubnrmbunckhpjzq
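To make the "projected gradient descent" terminology concrete, here is a generic sketch (not the paper's SPG-GMKL optimizer): step along the negative gradient of a hypothetical GMKL objective over kernel weights, then project back onto the non-negative orthant. grad_fn is a placeholder supplied by the caller.

```python
import numpy as np

def projected_gradient_descent(d0, grad_fn, lr=0.1, steps=100):
    """Generic projected gradient descent over non-negative kernel weights d.

    grad_fn(d) is a placeholder for the gradient of the MKL objective with
    respect to the kernel weights; the projection is simple clipping to d >= 0.
    Illustrative only, not the SPG-GMKL algorithm itself.
    """
    d = np.asarray(d0, dtype=float).copy()
    for _ in range(steps):
        d = d - lr * grad_fn(d)   # gradient step
        d = np.maximum(d, 0.0)    # project onto the feasible set d >= 0
    return d
```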

Principal whitened gradient for information geometry

Zhirong Yang, Jorma Laaksonen
2008 Neural Networks  
The optimization based on the principal whitened gradients demonstrates faster and more robust convergence in simulations on unsupervised learning with synthetic data and on discriminant analysis of breast  ...  Second, removal of the minor components of gradients enhances the estimation of the Fisher information matrix and reduces the computational cost.  ...  Acknowledgments This work is supported by the Academy of Finland in the projects Neural methods in information retrieval based on automatic content analysis and relevance feedback and Finnish Centre of  ... 
doi:10.1016/j.neunet.2007.12.016 pmid:18255260 fatcat:amsf3qn2yvhprfx4cp6uybvvma
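A rough sketch of the idea described in the entry above, under the assumption that the Fisher information is estimated from per-sample gradients and only its leading eigen-directions are retained before whitening; this is illustrative and not the authors' exact procedure.

```python
import numpy as np

def principal_whitened_gradient(grad, per_sample_grads, k=10, eps=1e-8):
    """Whiten a gradient with a rank-k approximation of the Fisher information.

    per_sample_grads: array (n_samples, n_params) used to build an empirical
    Fisher estimate. Keeping only the top-k eigen-directions both regularizes
    the estimate and reduces the cost of the whitening step (sketch only).
    """
    G = np.asarray(per_sample_grads)
    F = G.T @ G / G.shape[0]                 # empirical Fisher estimate
    eigvals, eigvecs = np.linalg.eigh(F)     # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:k]      # principal components of F
    V, lam = eigvecs[:, idx], eigvals[idx]
    # Natural-gradient-style direction restricted to the principal subspace.
    return V @ ((V.T @ grad) / (lam + eps))
```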

Universal Adversarial Training

Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, Tom Goldstein
2020 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
robust models with only 2× the cost of natural training.  ...  We study the efficient generation of universal adversarial perturbations, and also efficient methods for hardening networks to these attacks.  ...  The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the  ... 
doi:10.1609/aaai.v34i04.6017 fatcat:ybmlkpc66jaepkj5q2bujk3vti

Robust Trust Region for Weakly Supervised Segmentation [article]

Dmitrii Marin, Yuri Boykov
2021 arXiv   pre-print
We propose a new robust trust region approach for regularized losses, improving the state-of-the-art results. Our approach can be seen as a higher-order generalization of the classic chain rule.  ...  However, many common priors require optimization stronger than gradient descent. Thus, such regularizers have limited applicability in deep learning.  ...  We also thank Vladimir Kolmogorov for suggesting prior studies of the tightness of the Potts model relaxations. [figure: qualitative segmentation results comparing PCE-GD, Grid-GD, Dense-GD, and Grid-TR against the input images and ground truth]  ... 
arXiv:2104.01948v2 fatcat:p7z3e6hd75ds7kqqb2tmoxqyhm

Convolutional Neural Network Based Multimodal Biometric Human Authentication using Face, Palm Veins and Fingerprint

2020 International Journal of Innovative Technology and Exploring Engineering (IJITEE), Volume 8, Issue 10, August 2019, Regular Issue  
In the proposed system, a multi-layer Convolutional Neural Network (CNN) is applied to multimodal biometric human authentication using face, palm veins, and fingerprints to increase the robustness of the system.  ...  The performance of the system is evaluated on the basis of percentage recognition accuracy, and it shows significant improvement over unimodal biometric systems and existing multimodal systems.  ...  The robustness of stochastic gradient descent and the effectiveness of batch gradient descent are combined in the mini-batch gradient method (a sketch follows this entry).  ... 
doi:10.35940/ijitee.c8467.019320 fatcat:myivrdtwavbyjeyqlkkssofcwa
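For reference, the mini-batch gradient method mentioned in the snippet can be sketched as follows (a generic sketch, not this paper's training setup; grad_fn is a placeholder for the batch-gradient computation):

```python
import numpy as np

def minibatch_gd(theta, X, y, grad_fn, lr=0.01, batch_size=32, epochs=10):
    """Mini-batch gradient descent: average gradients over small random batches.

    grad_fn(theta, X_batch, y_batch) is assumed to return the mean gradient on
    a batch. Batches larger than one smooth the noise of pure SGD while staying
    far cheaper per step than full-batch gradient descent.
    """
    n = X.shape[0]
    for _ in range(epochs):
        order = np.random.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            theta = theta - lr * grad_fn(theta, X[idx], y[idx])
    return theta
```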

TAdam: A Robust Stochastic Gradient Optimizer [article]

Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Kenji Sugimoto
2020 arXiv   pre-print
We therefore propose a new stochastic gradient optimization method whose robustness is directly built into the algorithm, using the robust Student-t distribution as its core idea (an illustrative sketch follows this entry).  ...  Adam, the popular optimization method, is modified with our method, and the resulting optimizer, called TAdam, is shown to effectively outperform Adam in terms of robustness against noise on diverse  ...  The field of machine learning is undoubtedly dominated by first-order optimization methods based on the gradient descent algorithm [1] and, particularly, its stochastic variant, the stochastic  ... 
arXiv:2003.00179v2 fatcat:h632ernfsra5pd4pajr2tfwtxi
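The sketch below shows what "building robustness into Adam via a Student-t idea" can look like: gradients that deviate strongly from the running first moment are down-weighted with a heavy-tailed weight. This is an illustrative variant, not the published TAdam update; robust_adam_step, dof, and the state dictionary are names introduced here.

```python
import numpy as np

def robust_adam_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
                     eps=1e-8, dof=1.0):
    """One Adam-style step with a Student-t-inspired robust weight (sketch only).

    state holds the first moment m, second moment v, and step count t, e.g.
    state = {"m": np.zeros_like(theta), "v": np.zeros_like(theta), "t": 0}.
    """
    m, v, t = state["m"], state["v"], state["t"] + 1
    d = grad.size
    # Outlier gradients (far from m relative to v) get a small weight w < 1,
    # so they move the first moment less: the heavy-tailed likelihood idea.
    w = (dof + d) / (dof + np.sum((grad - m) ** 2 / (v + eps)))
    beta1_eff = beta1 if w >= 1.0 else 1.0 - (1.0 - beta1) * w
    m = beta1_eff * m + (1.0 - beta1_eff) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    state.update(m=m, v=v, t=t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), state
```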

Adaptive shot allocation for fast convergence in variational quantum algorithms [article]

Andi Gu, Angus Lowe, Pavel A. Dub, Patrick J. Coles, Andrew Arrasmith
2021 arXiv   pre-print
Here we present a new stochastic gradient descent method using an adaptive number of shots at each step, called the global Coupled Adaptive Number of Shots (gCANS) method, which improves on prior art in both the number of iterations and the number of shots required.  ...  Lipschitz-continuous gradients: an appropriate choice of the learning rate, α, is crucial for a well-behaved gradient descent algorithm (a small illustration follows this entry).  ... 
arXiv:2108.10434v1 fatcat:wpuxujlef5b2zkd75rmwxbyksu
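A small illustration of the learning-rate remark in the snippet (plain gradient descent on a toy quadratic, not the paper's gCANS shot-allocation scheme): when the gradient is L-Lipschitz, a step size α ≤ 1/L keeps the iteration well behaved.

```python
import numpy as np

# Toy objective f(x) = 0.5 * x^T A x; its gradient A x is L-Lipschitz with
# L equal to the largest eigenvalue of A.
A = np.diag([1.0, 10.0])
L = np.max(np.linalg.eigvalsh(A))
alpha = 1.0 / L                     # safe learning rate for a smooth objective

x = np.array([5.0, -3.0])
for _ in range(200):
    x = x - alpha * (A @ x)         # gradient descent step
print(x)                            # approaches the minimizer at the origin
```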

Training Efficiency and Robustness in Deep Learning [article]

Fartash Faghri
2021 arXiv   pre-print
In this thesis, we study approaches to improve the training efficiency and robustness of deep learning models.  ...  We show that a redundancy-aware modification to the sampling of training data improves the training speed, and we develop an efficient method for detecting the diversity of the training signal, namely, gradient  ...  the game of Go [185], and natural language translation [157].  ... 
arXiv:2112.01423v1 fatcat:3yqco7htnjdbng4hx2ilkrnkaq

A Neural Network Based on Synchronized Pairs of Nano-Oscillators [article]

Damir Vodenicarevic, Nicolas Locatelli, Damien Querlioz
2017 arXiv   pre-print
With the end of CMOS scaling and increasing demand for efficient neural networks, alternative architectures implementing neural functions efficiently are being studied.  ...  These results open the way for the design of alternative architectures adapted to efficient neural network execution.  ...  nano-oscillators [13] , an exponential-decay peak detector, and the standard gradient descent learning algorithm.  ... 
arXiv:1709.02274v1 fatcat:4u5bi2ntdfgwfmb4knsiua4c6e

Universal Adversarial Training [article]

Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, Tom Goldstein
2019 arXiv   pre-print
only 2X the cost of natural training.  ...  We study the efficient generation of universal adversarial perturbations, and also efficient methods for hardening networks to these attacks.  ...  Projected Gradient Descent (PGD) applies FGSM iteratively, and is one of the strongest per-instance attacks (Athalye, Carlini, and Wagner, 2018) (a PGD sketch follows this entry).  ... 
arXiv:1811.11304v2 fatcat:okcbq7w2lfcnvkmxxgpxadz3gy
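Since the snippet spells out the PGD attack (iterated FGSM with a projection back into the ε-ball), here is a minimal L∞ sketch; grad_fn stands in for the caller's loss-gradient computation and the inputs are assumed to live in [0, 1]. Note this is a per-instance attack, not the paper's universal perturbation.

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=8 / 255, alpha=2 / 255, steps=10):
    """L-infinity PGD: repeat an FGSM step, then project back into the eps-ball.

    grad_fn(x_adv) is assumed to return the gradient of the loss with respect
    to the input (supplied by the caller); illustrative sketch only.
    """
    x_adv = x + np.random.uniform(-eps, eps, size=x.shape)  # random start
    for _ in range(steps):
        g = grad_fn(x_adv)
        x_adv = x_adv + alpha * np.sign(g)        # FGSM step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in the valid input range
    return x_adv
```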

Approximated Geodesic Updates with Principal Natural Gradients

Zhirong Yang, Jorma Laaksonen
2007 Neural Networks (IJCNN), International Joint Conference on  
The proposed method demonstrates faster and more robust convergence in the simulations on recovering a Gaussian mixture of artificial data and on discriminative learning of ionosphere data.  ...  Second, we prove that dimensionality reduction of natural gradients is necessary for learning multidimensional linear transformations.  ...  Calculating the learning direction with only principal components of the natural gradients thus enhances both efficiency and robustness.  ... 
doi:10.1109/ijcnn.2007.4371149 dblp:conf/ijcnn/YangL07 fatcat:r6bbfwsz7zcixac5fzzefm3kxu

Online Learning Rate Adaptation with Hypergradient Descent [article]

Atilim Gunes Baydin, Robert Cornish, David Martinez Rubio, Mark Schmidt, Frank Wood
2018 arXiv   pre-print
Our method works by dynamically updating the learning rate during optimization, using the gradient, with respect to the learning rate, of the update rule itself (a minimal sketch follows this entry).  ...  We demonstrate the effectiveness of the method in a range of optimization problems by applying it to stochastic gradient descent, stochastic gradient descent with Nesterov momentum, and Adam, showing that  ...  In nearly all gradient descent algorithms the choice of learning rate remains central to efficiency; Bengio (2012) asserts that it is "often the single most important hyper-parameter" and  ... 
arXiv:1703.04782v3 fatcat:bh7nxmrxfnfdrag4fn2pxlapfa
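For plain SGD the idea can be written down directly: since θ_t = θ_{t-1} − α∇f(θ_{t-1}), the gradient of f(θ_t) with respect to α is −∇f(θ_t)·∇f(θ_{t-1}), so α is nudged by the dot product of consecutive gradients. A minimal sketch (grad_fn is a placeholder for the stochastic gradient):

```python
import numpy as np

def sgd_hd(theta, grad_fn, alpha=0.01, beta=1e-4, steps=100):
    """SGD with hypergradient learning-rate adaptation (minimal sketch).

    alpha is adapted with the hypergradient -g_t . g_{t-1}: it grows when
    consecutive stochastic gradients agree and shrinks when they oppose.
    """
    prev_grad = np.zeros_like(theta)
    for _ in range(steps):
        g = grad_fn(theta)
        alpha = alpha + beta * np.dot(g, prev_grad)  # hypergradient step on alpha
        theta = theta - alpha * g                    # ordinary SGD step
        prev_grad = g
    return theta, alpha
```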

Robust Estimation of Natural Gradient in Optimization by Regularized Linear Regression [chapter]

Luigi Malagò, Matteo Matteucci
2013 Lecture Notes in Computer Science  
The correspondence between the estimation of the natural gradient and solving a linear regression problem leads to the definition of regularized versions of the natural gradient.  ...  We propose a robust estimation of the natural gradient for the exponential family based on regularized least squares.  ...  of the natural gradient in steepest descent blackbox model-based search.  ... 
doi:10.1007/978-3-642-40020-9_97 fatcat:xqg2dsmjcjaltpimx4i5h46xva
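On one reading of the abstract above, the correspondence works roughly like this: regressing the sampled objective values on the centered sufficient statistics of the exponential-family model yields a natural-gradient direction, and adding a ridge penalty gives the regularized (robust) variant. A hedged sketch under that reading (natural_gradient_ridge and its arguments are names introduced here):

```python
import numpy as np

def natural_gradient_ridge(stats, f_values, lam=1e-2):
    """Estimate a natural-gradient direction by regularized least squares.

    stats: array (n_samples, n_params) of sufficient statistics T(x_i) for
    points sampled from the current model; f_values are the corresponding
    objective values. Ridge regression of f on the centered statistics gives
    a regularized estimate of the natural gradient (illustrative sketch).
    """
    Phi = stats - stats.mean(axis=0)      # center the sufficient statistics
    y = f_values - f_values.mean()
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)
```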

SpikePropamine: Differentiable Plasticity in Spiking Neural Networks

Samuel Schmidgall, Julia Ashkanazy, Wallace Lawson, Joe Hays
2021 Frontiers in Neurorobotics  
gradient descent.  ...  Here, we introduce a framework for simultaneously learning the underlying fixed weights and the rules governing the dynamics of synaptic plasticity and neuromodulated synaptic plasticity in SNNs through  ...  Here, the local variable η^(l) acts as a free parameter and is learned through gradient descent.  ... 
doi:10.3389/fnbot.2021.629210 pmid:34630063 pmcid:PMC8493296 fatcat:5njutoheojau3idveemwyfnvoe
Showing results 1 — 15 out of 21,502 results