Byzantine-Resilient Stochastic Gradient Descent for Distributed Learning: A Lipschitz-Inspired Coordinate-wise Median Approach
[article] · 2019 · arXiv pre-print
Toward this end, we propose a new Lipschitz-inspired coordinate-wise median approach (LICM-SGD) to mitigate Byzantine attacks. ...
In this work, we consider the resilience of distributed algorithms based on stochastic gradient descent (SGD) in distributed learning with potentially Byzantine attackers, who could send arbitrary information ...
In [11], the authors proposed to let the parameter server keep a small set of data to compute an estimate of the true gradient, which is used as a benchmark to filter out suspicious gradients. ...
arXiv:1909.04532v1
fatcat:ullh3kooerfvbdtif5dxx7f6dm
Byzantine Fault Tolerance in Distributed Machine Learning: a Survey
[article] · 2022 · arXiv pre-print
In this paper, we present a survey of recent works surrounding BFT in DML, mainly in first-order optimization methods, especially Stochastic Gradient Descent (SGD). ...
Byzantine Fault Tolerance (BFT) is among the most challenging problems in Distributed Machine Learning (DML). ...
[152] proposed Lipschitz-inspired coordinate-wise median (LICM-SGD) to mitigate Byzantine attacks. ...
arXiv:2205.02572v1
fatcat:h2hkcgz3w5cvrnro6whl2rpvby
Genuinely Distributed Byzantine Machine Learning
[article] · 2020 · arXiv pre-print
We present a new algorithm, ByzSGD, which solves the general Byzantine-resilient distributed machine learning problem by relying on three major schemes. ...
We initiate in this paper the study of the "general" Byzantine-resilient distributed machine learning problem where no individual component is trusted. ...
Most experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well ...
arXiv:1905.03853v2
fatcat:u6irl56wsregref72p74napnka
Tolerating Adversarial Attacks and Byzantine Faults in Distributed Machine Learning
[article] · 2021 · arXiv pre-print
In this paper, we propose a novel distributed training algorithm, partial synchronous stochastic gradient descent (ParSGD), which defends adversarial attacks and/or tolerates Byzantine faults. ...
In addition, Byzantine faults, including software, hardware, and network issues, occur in distributed systems and likewise have a negative impact on the prediction outcome. ...
Definition 1 (Coordinate-wise median): For vectors $V_i \in \mathbb{R}^d$, $i \in [1, n]$, the coordinate-wise median $g = \mathrm{med}\{V_i : i \in [1, n]\}$ is a vector with its $k$-th coordinate being $g_k = \mathrm{med}\{V_i^k : i \in [1, n]\}$ ...
arXiv:2109.02018v1
fatcat:54ndlvp455ainklmvemlgwunaq
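Definition 1 in the ParSGD entry above is straightforward to implement. A minimal Python/numpy sketch (the function name and the (n, d) array layout are my own assumptions, not from the paper):

```python
import numpy as np

def coordinatewise_median(grads):
    """Coordinate-wise median of worker gradients (Definition 1).

    grads: array of shape (n, d) -- one d-dimensional gradient per worker.
    Returns the d-dimensional vector whose k-th entry is the median of
    the k-th coordinates across all n workers.
    """
    return np.median(np.asarray(grads), axis=0)
```

Because each coordinate's median is computed independently, fewer than n/2 corrupted vectors cannot push any coordinate of the aggregate outside the range spanned by the honest workers' values for that coordinate.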
Holdout SGD: Byzantine Tolerant Federated Learning
[article] · 2020 · arXiv pre-print
This work presents a new distributed Byzantine tolerant federated learning algorithm, HoldOut SGD, for Stochastic Gradient Descent (SGD) optimization. ...
We propose two possible mechanisms for the coordination of workers in the distributed computation of HoldOut SGD. ...
Similarly, the Coordinate-wise Trimmed Mean method [9] uses an aggregation which evaluates a robust mean around the median. ...
arXiv:2008.04612v1
fatcat:kxkapnl2draxbmhyfgbm2zhjpm
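The coordinate-wise trimmed mean referenced in the HoldOut SGD entry above discards extremes per coordinate before averaging. A minimal sketch, assuming a symmetric trim parameter b of my own choosing:

```python
import numpy as np

def trimmed_mean(grads, b):
    """Coordinate-wise trimmed mean of worker gradients.

    For each coordinate, sort the n values received from the workers,
    drop the b smallest and b largest, and average the remaining
    n - 2b values -- a robust mean centered around the median.
    """
    grads = np.sort(np.asarray(grads), axis=0)  # sort each coordinate
    n = grads.shape[0]
    assert 2 * b < n, "need at least one surviving value per coordinate"
    return grads[b:n - b].mean(axis=0)
```

Choosing b at or above the number of Byzantine workers ensures that, in each coordinate, every corrupted value is either discarded or bracketed by honest values.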
Learning from History for Byzantine Robust Optimization
[article] · 2021 · arXiv pre-print
Byzantine robustness has received significant attention recently given its importance for distributed and federated learning. ...
This is the first provably robust method for the standard stochastic optimization setting. Our code is open sourced at https://github.com/epfml/byzantine-robust-optimizer. ...
We thank Eduard Gorbunov and Dan Alistarh for comments on our earlier drafts. We are partly supported by a Google Focused Research Award. ...
arXiv:2012.10333v3
fatcat:vgpkiqzb75aetc3ipaniocjvmu
Byzantine Resilient Distributed Multi-Task Learning
[article] · 2021 · arXiv pre-print
In this paper, we present an approach for Byzantine resilient distributed multi-task learning. ...
distributed algorithms for learning relatedness among tasks are not resilient in the presence of Byzantine agents. ...
... the coordinate-wise median [29, 30, 31], the geometric median [32, 33], and the Krum algorithm [34]. ...
arXiv:2010.13032v2
fatcat:x5i3nlirvjcatfjohs66g2jwmy
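Among the aggregation rules listed in the entry above, the geometric median has no closed form and is typically approximated with Weiszfeld's algorithm. A minimal sketch (the iteration cap and tolerance are my own choices):

```python
import numpy as np

def geometric_median(vectors, iters=100, eps=1e-8):
    """Approximate the geometric median: the point minimizing the sum
    of Euclidean distances to the input vectors (Weiszfeld iterations)."""
    V = np.asarray(vectors, dtype=float)
    z = V.mean(axis=0)  # start from the arithmetic mean
    for _ in range(iters):
        dists = np.maximum(np.linalg.norm(V - z, axis=1), eps)
        w = 1.0 / dists  # closer points get larger weights
        z_new = (w[:, None] * V).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < eps:
            break
        z = z_new
    return z
```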
Resilient Consensus-based Multi-agent Reinforcement Learning with Function Approximation
[article] · 2021 · arXiv pre-print
We propose a resilient consensus-based actor-critic algorithm, whereby each agent estimates the team-average reward and value function, and communicates the associated parameter vectors to its immediate ...
Adversarial attacks during training can strongly influence the performance of multi-agent reinforcement learning algorithms. ...
Acknowledgments We thank Krishna Chaitanya Kosaraju for valuable discussion about the code implementation. ...
arXiv:2111.06776v2
fatcat:mte4ncfaq5dndc4ihvek4leao4
ByGARS: Byzantine SGD with Arbitrary Number of Attackers
[article] · 2020 · arXiv pre-print
We propose two novel stochastic gradient descent algorithms, ByGARS and ByGARS++, for distributed machine learning in the presence of any number of Byzantine adversaries. ...
This reputation score is then used for aggregating the gradients for stochastic gradient descent. ...
These approaches relied on techniques like majority voting, geometric median, median of means, coordinate-wise median, coordinate-wise trimmed mean, etc., to aggregate gradients at the server. ...
arXiv:2006.13421v2
fatcat:tiz7nkd6uvgctojqahcfrwwqwi
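Of the aggregators listed in the ByGARS entry above, median of means combines averaging with a median step. A minimal sketch, assuming the gradients are simply split into contiguous groups (the grouping strategy is my own simplification):

```python
import numpy as np

def median_of_means(grads, num_groups):
    """Median-of-means aggregation: partition the n gradients into
    groups, average within each group, then take the coordinate-wise
    median of the group means."""
    groups = np.array_split(np.asarray(grads), num_groups, axis=0)
    means = np.stack([g.mean(axis=0) for g in groups])
    return np.median(means, axis=0)
```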
BRIDGE: Byzantine-resilient Decentralized Gradient Descent
[article] · 2022 · arXiv pre-print
In this paper, a scalable, Byzantine-resilient decentralized machine learning framework termed Byzantine-resilient decentralized gradient descent (BRIDGE) is introduced. ...
But the study of Byzantine resilience within decentralized learning, in contrast to distributed learning, is still in its infancy. ...
A necessarily incomplete list of these works, most of which have developed and analyzed Byzantine-resilient distributed learning approaches from the perspective of stochastic gradient descent, includes ...
arXiv:1908.08098v2
fatcat:uh7wdsvotzbkppmele2pchk5kq
Securing Distributed Gradient Descent in High Dimensional Statistical Learning
[article] · 2019 · arXiv pre-print
We propose a secured variant of the gradient descent method that can tolerate up to a constant fraction of Byzantine workers, i.e., q/m = O(1). ...
We consider unreliable distributed learning systems wherein the training data is kept confidential by external workers, and the learner has to interact closely with those workers to train a model. ...
Su was supported in part by the NSF Science & Technology Center for Science of Information Grant CCF-0939370. J. Xu was supported in part by the NSF Grant CCF-1755960. ...
arXiv:1804.10140v3
fatcat:pdbv2gvvozei7premipvw27yze
Byzantine-Resilient SGD in High Dimensions on Heterogeneous Data
[article] · 2020 · arXiv pre-print
We study distributed stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. ...
We also propose and analyze a Byzantine-resilient SGD algorithm with gradient compression, where workers send k random coordinates of their gradients. ...
is bounded by $O\big(\kappa_{\mathrm{mean}}^2 + \frac{d}{n}\big)$. [YCRB18] employed coordinate-wise median and trimmed median, and got an approximation error of $O\big(\frac{d^2}{nR}\big)$ for both convex and non-convex objectives, which could be prohibitive ...
arXiv:2005.07866v1
fatcat:ixlbiuqipfcztchkwaewrej77q
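The "k random coordinates" compression in the entry above matches the standard rand-k sparsifier. A minimal sketch, assuming the common d/k rescaling that makes the compressor unbiased (the paper's exact scaling may differ):

```python
import numpy as np

def rand_k_compress(grad, k, rng=None):
    """Rand-k compression: keep k uniformly chosen coordinates of a
    d-dimensional gradient, rescaled by d/k so that the expectation of
    the output equals the input, and zero out the rest."""
    rng = rng or np.random.default_rng()
    grad = np.asarray(grad, dtype=float)
    d = grad.shape[0]
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(grad)
    out[idx] = grad[idx] * (d / k)  # unbiased: E[out] = grad
    return out
```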