
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning [article]

Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh
2021 arXiv   pre-print
Federated Averaging (FedAvg) has emerged as the algorithm of choice for federated learning due to its simplicity and low communication cost.  ...  As a solution, we propose a new algorithm (SCAFFOLD) which uses control variates (variance reduction) to correct for the 'client-drift' in its local updates.  ...  We thank Filip Hanzely and Jakub Konečný for discussions regarding variance reduction techniques and Blake Woodworth, Virginia Smith and Kumar Kshitij Patel for suggestions which improved the writing.  ...
arXiv:1910.06378v4 fatcat:fueqnbenv5fylhqoarhiusi2km
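The snippet above describes SCAFFOLD's key mechanism: each client corrects its local SGD steps with control variates to counter client-drift. A minimal sketch of that corrected local update, assuming a toy quadratic objective and illustrative names (`scaffold_local_update` and the specific defaults are not from the paper):

```python
import numpy as np

def scaffold_local_update(x, grad_fn, c_i, c, lr=0.1, steps=5):
    """Corrected local SGD: y <- y - lr * (grad(y) - c_i + c),
    where c_i / c are the client / server control variates."""
    y = x.copy()
    for _ in range(steps):
        y -= lr * (grad_fn(y) - c_i + c)
    # "Option II"-style control-variate refresh: c_i+ = c_i - c + (x - y)/(steps*lr)
    c_i_new = c_i - c + (x - y) / (steps * lr)
    return y, c_i_new

# Toy quadratic f(y) = 0.5*||y||^2, so grad(y) = y; with zero control
# variates the update reduces to plain local SGD.
y, c_i_new = scaffold_local_update(np.ones(3), lambda y: y,
                                   np.zeros(3), np.zeros(3))
```

With nonzero control variates, the term `-c_i + c` shifts each client's drift toward the average update direction, which is the variance-reduction effect the abstract refers to.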

WAFFLE: Weighted Averaging for Personalized Federated Learning [article]

Martin Beaussart, Felix Grimberg, Mary-Anne Hartley, Martin Jaggi
2021 arXiv   pre-print
We introduce WAFFLE (Weighted Averaging For Federated LEarning), a personalized collaborative machine learning algorithm that leverages stochastic control variates for faster convergence.  ...  Through a series of experiments, we compare our new approach to two recent personalized federated learning methods--Weight Erosion and APFL--as well as two general FL methods--Federated Averaging and SCAFFOLD  ...  Acknowledgements We thank David Roschewitz for sharing with us his implementation of APFL, and Freya Behrens for inspiring us on how to create synthetic concept shift.  ... 
arXiv:2110.06978v2 fatcat:vc4si452svagxlusw23lfwlm6m

Differentially Private Federated Learning on Heterogeneous Data [article]

Maxence Noble, Aurélien Bellet, Aymeric Dieuleveut
2021 arXiv   pre-print
Federated Learning (FL) is a paradigm for large-scale distributed learning which faces two key challenges: (i) efficient training from highly heterogeneous user data, and (ii) protecting the privacy of  ...  Using advanced results from DP theory, we establish the convergence of our algorithm for convex and non-convex objectives.  ...  Acknowledgments We thank Baptiste Goujaud and Constantin Philippenko for interesting discussions. The work of A. Dieuleveut is partially supported by ANR-19-CHIA-0002-01 /chaire SCAI, and Hi! Paris.  ... 
arXiv:2111.09278v1 fatcat:3bqaqtbclzbjrgy6hz6kvycfge

Adaptive Federated Optimization [article]

Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan
2021 arXiv   pre-print
Standard federated optimization methods such as Federated Averaging (FedAvg) are often difficult to tune and exhibit unfavorable convergence behavior.  ...  Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data.  ...  SCAFFOLD: Stochastic controlled averaging for on-device federated learning. arXiv preprint arXiv:1910.06378, 2019. Ahmed Khaled, Konstantin Mishchenko, and Peter Richtárik.  ... 
arXiv:2003.00295v5 fatcat:dbgcdickyjhozetltc7agt5gj4

FedBE: Making Bayesian Model Ensemble Applicable to Federated Learning [article]

Hong-You Chen, Wei-Lun Chao
2021 arXiv   pre-print
Federated learning aims to collaboratively train a strong global model by accessing users' locally trained models but not their own data.  ...  learning algorithm intact.  ...  Ensemble learning and stochastic weight average.  ... 
arXiv:2009.01974v4 fatcat:svmvm4zxevcpjhwwxxjbhobyau

Federated Learning with Compression: Unified Analysis and Sharp Guarantees [article]

Farzin Haddadpour, Mohammad Mahdi Kamani, Aryan Mokhtari, Mehrdad Mahdavi
2020 arXiv   pre-print
In federated learning, communication cost is often a critical bottleneck to scale up distributed optimization algorithms to collaboratively learn a model from millions of devices with potentially unreliable  ...  For the homogeneous setting, our analysis improves existing bounds by providing tighter convergence rates for both strongly convex and non-convex objective functions.  ...  We also gratefully acknowledge the generous support of NVIDIA for providing GPUs for our research.  ... 
arXiv:2007.01154v2 fatcat:ojrm7w44wjdhfhf6hpfe5yy4ge

FedProc: Prototypical Contrastive Federated Learning on Non-IID data [article]

Xutong Mu, Yulong Shen, Ke Cheng, Xueli Geng, Jiaxuan Fu, Tao Zhang, Zhiwei Zhang
2021 arXiv   pre-print
In this paper, we propose FedProc: prototypical contrastive federated learning, which is a simple and effective federated learning framework.  ...  Federated learning allows multiple clients to collaborate to train high-performance deep learning models while keeping the training data locally.  ...  We also find that the average training time of FedProc on CIFAR-10 and CIFAR-100 is nearly the same as most of the methods (e.g., SCAFFOLD and FedProx).  ... 
arXiv:2109.12273v1 fatcat:wgnejjuaxzcpzllgu32viak7ym

Implicit Gradient Alignment in Distributed and Federated Learning [article]

Yatin Dandi, Luis Barba, Martin Jaggi
2021 arXiv   pre-print
A major obstacle to achieving global convergence in distributed and federated learning is the misalignment of gradients across clients, or mini-batches due to heterogeneity and stochasticity of the distributed  ...  We experimentally validate the benefits of our algorithm in different distributed and federated learning settings.  ...  where the expectation is over the random variables {ζ_{i,k}}_{k=1}^{K} controlling the stochasticity of the local updates for each client i.  ...
arXiv:2106.13897v3 fatcat:nsu6wmomkzcmbi6y37bvnfmoe4

Server Averaging for Federated Learning [article]

George Pu, Yanlin Zhou, Dapeng Wu, Xiaolin Li
2021 arXiv   pre-print
In particular, federated learning converges slower than centralized training. We propose the server averaging algorithm to accelerate convergence.  ...  However, the improved privacy of federated learning also introduces challenges including higher computation and communication costs.  ...  Stochastic Weight Averaging (SWA) applies Averaged SGD to deep learning [2].  ...
arXiv:2103.11619v1 fatcat:nsysrprudrc3vpuia7z7tmhy5a
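The snippet above mentions Stochastic Weight Averaging (SWA), i.e. applying Averaged SGD to deep learning. A minimal sketch of the idea, assuming a toy quadratic objective (function name and defaults are illustrative, not from the paper):

```python
import numpy as np

def sgd_with_weight_averaging(x0, grad_fn, lr=0.1, steps=10):
    """Plain SGD plus a running (uniform) average of the visited iterates."""
    x, avg = x0.copy(), x0.copy()
    for t in range(1, steps + 1):
        x -= lr * grad_fn(x)
        avg += (x - avg) / (t + 1)  # incremental mean over x0, x1, ..., xt
    return x, avg

# Toy quadratic f(x) = 0.5*||x||^2, grad(x) = x: the averaged iterate lags
# behind the last iterate, which smooths out stochastic noise in practice.
x_last, x_avg = sgd_with_weight_averaging(np.ones(3), lambda x: x)
```

Averaging the trajectory rather than keeping only the final iterate is the same mechanism the server-averaging proposal builds on at the server side.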

An Expectation-Maximization Perspective on Federated Learning [article]

Christos Louizos, Matthias Reisser, Joseph Soriaga, Max Welling
2021 arXiv   pre-print
in federated learning.  ...  By similarly using the hard-EM algorithm for learning, we obtain FedSparse, a procedure that can learn sparse neural networks in the federated learning setting.  ...  It should be noted that SCAFFOLD doubles the communication cost per round (compared to FedAvg) due to transmitting the control variates.  ... 
arXiv:2111.10192v1 fatcat:ppxl4jp2qjg4ddo7x73fz2lt7e

FedLGA: Towards System-Heterogeneity of Federated Learning via Local Gradient Approximation [article]

Xingyu Li, Zhe Qu, Bo Tang, Zhuo Lu
2021 arXiv   pre-print
Federated Learning (FL) is a decentralized machine learning architecture, which leverages a large number of remote devices to learn a joint model with distributed training data.  ...  𝒪((1+ρ)√(E)/√(TN) + 1/T) and 𝒪((1+ρ)√(E)/√(TK) + 1/T) for full and partial device participation respectively, where E is the number of local learning epochs, T is the number of total communication rounds, N is the total  ...  Suresh, "Scaffold: Stochastic controlled averaging for federated learning," in International Conference on Machine Learning. PMLR, 2020, pp. 5132–5143. [18] Z. Qu, K. Lin, J. Kalagnanam, Z.  ...
arXiv:2112.11989v1 fatcat:zfft7aztjze7lmokc7u5j5jabe

FedPAGE: A Fast Local Stochastic Gradient Method for Communication-Efficient Federated Learning [article]

Haoyu Zhao, Zhize Li, Peter Richtárik
2021 arXiv   pre-print
Federated Averaging (FedAvg, also known as Local-SGD) (McMahan et al., 2017) is a classical federated learning algorithm in which clients run multiple local SGD steps before communicating their update  ...  Note that in both settings, the communication cost for each round is the same for both FedPAGE and SCAFFOLD.  ...  SCAFFOLD: Stochastic controlled averaging for federated learning. In International Conference on Machine Learning, pages 5132-5143. PMLR, 2020.  ... 
arXiv:2108.04755v1 fatcat:tveuya2qlrgpza4naywcsqhxde
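The snippet above recalls the classical FedAvg scheme: clients run multiple local SGD steps before communicating, and the server averages the results. A minimal sketch of one such round, assuming toy quadratic client objectives (function names and the unweighted average are illustrative simplifications):

```python
import numpy as np

def local_sgd(x, grad_fn, lr=0.1, steps=5):
    """One client's update: several local SGD steps from the shared model."""
    y = x.copy()
    for _ in range(steps):
        y -= lr * grad_fn(y)
    return y

def fedavg_round(x, client_grad_fns, lr=0.1, steps=5):
    """Each client runs local SGD; the server averages the local models."""
    local_models = [local_sgd(x, g, lr, steps) for g in client_grad_fns]
    return np.mean(local_models, axis=0)

# Two clients with quadratics centred at +1 and -1, modelling heterogeneous
# data; the averaged model sits between the two local optima.
x_new = fedavg_round(np.zeros(3), [lambda y: y - 1.0, lambda y: y + 1.0])
```

Running several local steps per round is what keeps communication low; the mismatch between the local optima here is exactly the heterogeneity that methods like SCAFFOLD and FedPAGE aim to handle more efficiently.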

Fed-LAMB: Layerwise and Dimensionwise Locally Adaptive Optimization Algorithm [article]

Belhal Karimi, Xiaoyun Li, Ping Li
2021 arXiv   pre-print
In the emerging paradigm of federated learning (FL), large amount of clients, such as mobile devices, are used to train possibly high-dimensional models on their respective data.  ...  We present Fed-LAMB, a novel federated learning method based on a layerwise and dimensionwise updates of the local models, alleviating the nonconvexity and the multilayered nature of the optimization task  ...  controlling the dimension-wise learning rates.  ... 
arXiv:2110.00532v2 fatcat:2j7v7svkrrffpmi5pxgcnabmb4

Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing [article]

Mikhail Khodak, Renbo Tu, Tian Li, Liam Li, Maria-Florina Balcan, Virginia Smith, Ameet Talwalkar
2021 arXiv   pre-print
We first identify key challenges and show how standard approaches may be adapted to form baselines for the federated setting.  ...  Hyperparameter optimization is even more challenging in federated learning, where models are learned over a distributed network of heterogeneous devices; here, the need to keep data on device and perform  ...  1901403, SES-1919453, IIS-1705121, IIS-1838017, IIS-2046613 and IIS-2112471; the Defense Advanced Research Projects Agency under cooperative agreements HR00112020003 and FA875017C0141; an AWS Machine Learning  ... 
arXiv:2106.04502v2 fatcat:u5rk3wegubdrday3kbopjls7ga

LoSAC: An Efficient Local Stochastic Average Control Method for Federated Optimization [article]

Huiming Chen, Huandong Wang, Quanming Yao, Yong Li, Depeng Jin, Qiang Yang
2021 arXiv   pre-print
Federated optimization (FedOpt), which targets collaboratively training a learning model across a large number of distributed clients, is vital for federated learning.  ...  Specifically, LoSAC significantly improves communication efficiency by more than 100% on average, mitigates the model divergence problem and is equipped with the defense ability against DLG.  ...  Local Stochastic Average Control: calculate ∆x_i ← x_i − x and ∆φ_i ← φ_i − φ; client i transmits (∆x_i, ∆φ_i) to the server. As is discussed, FedSaga uses partial local  ...
arXiv:2112.07839v2 fatcat:q45ds2zuevhgvb75umb2m4k6lu
Showing results 1 — 15 out of 569 results