12,225 Hits in 3.9 sec

Bowao (Charles Zacharie), Critique(s), 1. Brazzaville : Éditions Hémar, coll. Horizons critiques, 2007, 96 p. – ISBN 978-2-915448-07-8

Michel Naumann
2008 Études littéraires africaines
Indeed, the period in which the novels were written, the 1950s, has not been the subject of any precise study that would make it possible to ... Francophone Black Africa: BOWAO (CHARLES ZACHARIE), CRITIQUE(S), 1.  ... 
doi:10.7202/1035140ar fatcat:ssl5qbwebbhnpkmrbysz5hz65a

On Large-Cohort Training for Federated Learning [article]

Zachary Charles, Zachary Garrett, Zhouyuan Huo, Sergei Shmulyian, Virginia Smith
2021 arXiv   pre-print
Federated learning methods typically learn a model by iteratively sampling updates from a population of clients. In this work, we explore how the number of clients sampled at each round (the cohort size) impacts the quality of the learned model and the training dynamics of federated learning algorithms. Our work poses three fundamental questions. First, what challenges arise when trying to scale federated learning to larger cohorts? Second, what parallels exist between cohort sizes in federated learning and batch sizes in centralized learning? Last, how can we design federated learning methods that effectively utilize larger cohort sizes? We give partial answers to these questions based on extensive empirical evaluation. Our work highlights a number of challenges stemming from the use of larger cohorts. While some of these (such as generalization issues and diminishing returns) are analogs of large-batch training challenges, others (including training failures and fairness concerns) are unique to federated learning.
arXiv:2106.07820v1 fatcat:oyaejtpcfvgwri25lv6u76capa
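The round structure described above is easy to sketch. Below is a toy, self-contained FedAvg-style loop with a configurable cohort size; the scalar model, the data, and every function name here are illustrative assumptions, not the paper's experimental setup.

```python
import random

def local_update(w, client_data, lr=0.1):
    """One local pass of SGD on a client's data for a toy scalar model y = w*x."""
    for x, y in client_data:
        w -= lr * 2 * (w * x - y) * x
    return w

def fedavg_round(global_w, clients, cohort_size):
    """Sample a cohort of clients, train locally, and average the results."""
    cohort = random.sample(clients, cohort_size)
    return sum(local_update(global_w, data) for data in cohort) / cohort_size

# Toy population: each client holds noisy samples of y = 2x.
random.seed(0)
clients = [[(x, 2.0 * x + random.gauss(0, 0.1)) for x in (0.5, 1.0, 1.5)]
           for _ in range(100)]

w = 0.0
for _ in range(50):
    w = fedavg_round(w, clients, cohort_size=10)  # larger cohorts average away more per-round noise
```

Increasing `cohort_size` in a sketch like this simply reduces round-to-round variance; the paper's point is that in real federated systems the returns diminish and new failure modes appear.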

Local Adaptivity in Federated Learning: Convergence and Consistency [article]

Jianyu Wang, Zheng Xu, Zachary Garrett, Zachary Charles, Luyang Liu, Gauri Joshi
2021 arXiv   pre-print
The federated learning (FL) framework trains a machine learning model using decentralized data stored at edge client devices by periodically aggregating locally trained models. Popular optimization algorithms of FL use vanilla (stochastic) gradient descent for both local updates at clients and global updates at the aggregating server. Recently, adaptive optimization methods such as AdaGrad have been studied for server updates. However, the effect of using adaptive optimization methods for local updates at clients is not yet understood. We show in both theory and practice that while local adaptive methods can accelerate convergence, they can cause a non-vanishing solution bias, where the final converged solution may be different from the stationary point of the global objective function. We propose correction techniques to overcome this inconsistency and complement the local adaptive methods for FL. Extensive experiments on realistic federated training tasks show that the proposed algorithms can achieve faster convergence and higher test accuracy than the baselines without local adaptivity.
arXiv:2106.02305v1 fatcat:hph4vle6ibeanksuvwgovwhl4i
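To make "local adaptivity" concrete, here is a scalar toy in which each client runs AdaGrad-style steps with its own accumulator before the server averages. This is an invented illustration of the setup being analyzed, not the paper's algorithms or its proposed corrections.

```python
import random

def local_adagrad(w, data, lr=0.5, eps=1e-8, steps=5):
    """Client update with a per-client AdaGrad accumulator. Because each client
    normalizes by its own gradient history, the averaged fixed point can drift
    from the stationary point of the global objective (the 'solution bias')."""
    accum = 0.0
    for _ in range(steps):
        x, y = random.choice(data)
        g = 2 * (w * x - y) * x
        accum += g * g
        w -= lr * g / (accum ** 0.5 + eps)
    return w

def fl_round(global_w, clients):
    """Average the locally adapted models, as vanilla FL aggregation would."""
    return sum(local_adagrad(global_w, d) for d in clients) / len(clients)

random.seed(1)
clients = [[(1.0, 1.0)], [(1.0, 3.0)]]  # heterogeneous clients: local optima w=1 and w=3
w = 0.0
for _ in range(30):
    w = fl_round(w, clients)
```

With the adaptive normalization, each client's pull toward its own optimum has roughly unit magnitude regardless of distance, so the averaged iterate need not settle at the global least-squares solution.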

What Causes Care Coordination Problems? A Case for Microanalysis

Wayne Zachary, Russell Charles Maulitz, Drew A. Zachary
2016 eGEMs  
Acknowledgements: The research reported in this publication was supported by the National  ...  We gratefully acknowledge and thank Chioma Onyekwelu, Elissa Iverson, Zachary Risler, and Lauren Zenel, who collected and compiled the field data on care coordination communication in the preceding studies  ...  The tasking-note functionality in many commercial EHRs can capture data on some of the within-EHR communication links (about 50 percent is from data reported in Zachary, Maulitz, Iverson, Onyekwelu, Risler  ... 
doi:10.13063/2327-9214.1230 pmid:27563685 pmcid:PMC4975569 fatcat:3aiopnrldbbs5pc45mwibfqx2i

Gradient Coding via the Stochastic Block Model [article]

Zachary Charles, Dimitris Papailiopoulos
2018 arXiv   pre-print
Gradient descent and its many variants, including mini-batch stochastic gradient descent, form the algorithmic foundation of modern large-scale machine learning. Due to the size and scale of modern data, gradient computations are often distributed across multiple compute nodes. Unfortunately, such distributed implementations can face significant delays caused by straggler nodes, i.e., nodes that are much slower than average. Gradient coding is a new technique for mitigating the effect of stragglers via algorithmic redundancy. While effective, previously proposed gradient codes can be computationally expensive to construct, inaccurate, or susceptible to adversarial stragglers. In this work, we present the stochastic block code (SBC), a gradient code based on the stochastic block model. We show that SBCs are efficient, accurate, and that under certain settings, adversarial straggler selection becomes as hard as detecting a community structure in the multiple community, block stochastic graph model.
arXiv:1805.10378v1 fatcat:fyoxrnqelreexg3ywfhdegxgvm
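The stochastic block code itself is graph-based; as background for readers new to gradient coding, here is the simplest form of the idea, a fractional-repetition-style scheme in which each block of gradients is replicated across a group of workers so the full gradient sum survives stragglers. The function names and group layout below are illustrative assumptions, not the SBC construction.

```python
def encode(gradients, num_workers, replication):
    """Each group of `replication` workers computes the same block's sum,
    so one surviving worker per group is enough to decode."""
    groups = num_workers // replication
    blocks = [gradients[g::groups] for g in range(groups)]  # partition the gradients
    return [sum(blocks[w // replication]) for w in range(num_workers)]

def decode(worker_outputs, responded, num_workers, replication):
    """Recover the full gradient sum from any straggler pattern that leaves
    at least one responder per group."""
    groups = num_workers // replication
    total = 0.0
    for g in range(groups):
        members = [w for w in range(g * replication, (g + 1) * replication)
                   if w in responded]
        if not members:
            raise RuntimeError("an entire group straggled; cannot recover")
        total += worker_outputs[members[0]]  # any one copy of the block suffices
    return total

gradients = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
outputs = encode(gradients, num_workers=6, replication=2)
# Workers 1, 2, and 5 straggle; the sum is still exactly recoverable.
recovered = decode(outputs, responded={0, 3, 4}, num_workers=6, replication=2)
```

The costs the abstract alludes to show up even here: replication multiplies compute, and a fixed group layout is exactly what an adversarial straggler pattern can target.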

Approximate Gradient Coding via Sparse Random Graphs [article]

Zachary Charles, Dimitris Papailiopoulos, Jordan Ellenberg
2017 arXiv   pre-print
Distributed algorithms are often beset by the straggler effect, where the slowest compute nodes in the system dictate the overall running time. Coding-theoretic techniques have been recently proposed to mitigate stragglers via algorithmic redundancy. Prior work in coded computation and gradient coding has mainly focused on exact recovery of the desired output. However, slightly inexact solutions can be acceptable in applications that are robust to noise, such as model training via gradient-based algorithms. In this work, we present computationally simple gradient codes based on sparse graphs that guarantee fast and approximately accurate distributed computation. We demonstrate that sacrificing a small amount of accuracy can significantly increase algorithmic robustness to stragglers.
arXiv:1711.06771v1 fatcat:zce7rpmdjndhhnliw4437dsyoa

Foliar Micronutrient Application for High-Yield Maize

Zachary P. Stewart, Ellen T. Paparozzi, Charles S. Wortmann, Prakash Kumar Jha, Charles A. Shapiro
2020 Agronomy  
Nebraska soils are generally micronutrient sufficient. However, critical levels for current yields have not been validated. From 2013 to 2015, 26 on-farm paired comparison strip-trials were conducted across Nebraska to test the effect of foliar-applied micronutrients on maize (Zea mays L.) yield and foliar nutrient concentrations. Treatments were applied from V6 to V14 at sites with 10.9 to 16.4 Mg ha−1 yield. Soils ranged from silty clays to fine sands. Soil micronutrient availability and tissue concentrations were all above critical levels for deficiency. Significant grain yield increases were few. Micronutrient concentrations for leaf growth that occurred after foliar applications were increased 4 to 9 mg Zn kg−1 at 5 of 17 sites with application of 87 to 119 g Zn ha−1, 12 to 16 mg kg−1 Mn at 2 of 17 sites with application of 87 to 89 g Mn ha−1, and an average of 8.1 mg kg−1 Fe across 10 sites showing signs of Fe deficiency with application of 123 g foliar Fe ha−1. Foliar B concentration was not affected by B application. Increases in nutrient concentrations were not related to grain yield responses except for Mn (r = 0.54). The mean, significant grain yield response to 123 g foliar Fe ha−1 was 0.4 Mg ha−1 for the 10 sites with Fe deficiency symptoms. On average, maize yield response to foliar Fe application can be profitable if Fe deficiency symptoms are observed. Response to other foliar micronutrient applications is not likely to be profitable without solid evidence of a nutrient deficiency.
doi:10.3390/agronomy10121946 fatcat:gvouia7g2rbb5narvol62idhcu

Adaptive Federated Optimization [article]

Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan
2021 arXiv   pre-print
Peter Kairouz, H Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Keith Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, et al.  ... 
arXiv:2003.00295v5 fatcat:dbgcdickyjhozetltc7agt5gj4
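The snippet above is only a matched citation; the core idea in this line of work (adaptive server optimizers such as FedAdam) is to treat the averaged client update as a pseudo-gradient and apply an adaptive update at the server. A hedged scalar sketch of one such server step follows; the parameter values and the scalar setting are illustrative, not taken from the paper.

```python
def server_adam_step(w, delta, state, lr=0.1, b1=0.9, b2=0.99, eps=1e-3):
    """FedAdam-style server step: `delta` is the averaged client model change
    for this round, used as a pseudo-gradient in an Adam-like update."""
    m, v = state
    m = b1 * m + (1 - b1) * delta          # first-moment estimate
    v = b2 * v + (1 - b2) * delta * delta  # second-moment estimate
    return w + lr * m / (v ** 0.5 + eps), (m, v)

# Toy check: if each round's averaged delta points at a fixed target,
# the server iterate approaches that target.
w, state = 0.0, (0.0, 0.0)
for _ in range(300):
    w, state = server_adam_step(w, delta=1.0 - w, state=state)
```

The appeal of this decomposition is that clients stay as simple as in FedAvg, while all adaptive state lives on the server.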

A Geometric Perspective on the Transferability of Adversarial Directions [article]

Zachary Charles, Harrison Rosenberg, Dimitris Papailiopoulos
2018 arXiv   pre-print
State-of-the-art machine learning models frequently misclassify inputs that have been perturbed in an adversarial manner. Adversarial perturbations generated for a given input and a specific classifier often seem to be effective on other inputs and even different classifiers. In other words, adversarial perturbations seem to transfer between different inputs, models, and even different neural network architectures. In this work, we show that in the context of linear classifiers and two-layer ReLU networks, there provably exist directions that give rise to adversarial perturbations for many classifiers and data points simultaneously. We show that these "transferable adversarial directions" are guaranteed to exist for linear separators of a given set, and will exist with high probability for linear classifiers trained on independent sets drawn from the same distribution. We extend our results to large classes of two-layer ReLU networks. We further show that adversarial directions for ReLU networks transfer to linear classifiers while the reverse need not hold, suggesting that adversarial perturbations for more complex models are more likely to transfer to other classifiers. We validate our findings empirically, even for deeper ReLU networks.
arXiv:1811.03531v1 fatcat:e2cribnoarfbzj22rxch37qufm
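A tiny numeric illustration of transfer between linear classifiers (the numbers are chosen by hand for this note, not taken from the paper): a perturbation built against one weight vector also flips a second, correlated one, because the direction has negative inner product with both.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

w1 = (1.0, 0.5)   # classifier A
w2 = (0.8, 0.9)   # classifier B, correlated with A
x = (1.0, 1.0)    # both classify x as positive: dot(w, x) > 0

# Adversarial direction for A: step against w1, normalized.
norm = dot(w1, w1) ** 0.5
d = tuple(-a / norm for a in w1)

x_adv = tuple(a + 2.0 * b for a, b in zip(x, d))
flip1 = dot(w1, x) > 0 > dot(w1, x_adv)  # fools A by construction
flip2 = dot(w2, x) > 0 > dot(w2, x_adv)  # transfers to B since dot(w2, d) < 0
```

The paper's contribution is much stronger than this toy: it proves such directions exist simultaneously for whole families of separators and data points, not just two hand-picked classifiers.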

Differential Privacy and Machine Learning: a Survey and Review [article]

Zhanglong Ji, Zachary C. Lipton, Charles Elkan
2014 arXiv   pre-print
The objective of machine learning is to extract useful information from data, while privacy is preserved by concealing information. Thus it seems hard to reconcile these competing interests. However, they frequently must be balanced when mining sensitive data. For example, medical research represents an important application where it is necessary both to extract useful information and protect patient privacy. One way to resolve the conflict is to extract general characteristics of whole populations without disclosing the private information of individuals. In this paper, we consider differential privacy, one of the most popular and powerful definitions of privacy. We explore the interplay between machine learning and differential privacy, namely privacy-preserving machine learning algorithms and learning-based data release mechanisms. We also describe some theoretical results that address what can be learned differentially privately and upper bounds of loss functions for differentially private algorithms. Finally, we present some open questions, including how to incorporate public data, how to deal with missing data in private datasets, and whether, as the number of observed samples grows arbitrarily large, differentially private machine learning algorithms can be achieved at no cost to utility as compared to corresponding non-differentially private algorithms.
arXiv:1412.7584v1 fatcat:voe3w3rqkncpjemju2rbqun3ti
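The standard entry point to differential privacy, which this survey covers, is the Laplace mechanism: a query whose answer changes by at most 1 when one record changes (sensitivity 1, as for a count) can be released with epsilon-DP by adding Laplace noise of scale 1/epsilon. A minimal stdlib-only sketch (the dataset and query are invented for illustration):

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) by inverting the CDF of a uniform draw."""
    u = rng.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def private_count(records, predicate, epsilon, rng):
    """epsilon-DP counting query: counts have sensitivity 1, so Laplace
    noise with scale 1/epsilon suffices."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
ages = [23, 35, 41, 29, 52, 38, 61, 27, 45, 33]
noisy = private_count(ages, lambda a: a > 40, epsilon=0.5, rng=rng)  # true count is 4
```

Smaller epsilon means stronger privacy but noisier answers; averaging many independent releases recovers accuracy only at a corresponding privacy cost, which is exactly the utility trade-off the survey's open questions probe.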

Generating Random Factored Ideals in Number Fields [article]

Zachary Charles
2017 arXiv   pre-print
We present a randomized polynomial-time algorithm to generate a random integer according to the distribution of norms of ideals at most N in any given number field, along with the factorization of the integer. Using this algorithm, we can produce a random ideal in the ring of algebraic integers uniformly at random among ideals with norm up to N, in polynomial time. We also present a variant of this algorithm for generating ideals in function fields.
arXiv:1612.06260v2 fatcat:7a2l2c6ubvbhnnvcg7fk3ql2ce
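The integer case this paper generalizes has a strikingly simple randomized algorithm due to Kalai, which produces a uniform random integer in {1..N} together with its prime factorization. A sketch, with naive trial-division standing in for a real primality test (the helper names are this note's, not the paper's):

```python
import random

def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def random_factored_integer(N, rng):
    """Kalai's algorithm: return (r, factors) with r uniform on {1..N}
    and factors its prime factorization (with multiplicity)."""
    while True:
        # Random decreasing sequence N >= s1 >= s2 >= ... down to 1.
        seq, s = [], N
        while s > 1:
            s = rng.randint(1, s)
            seq.append(s)
        primes = [s for s in seq if is_prime(s)]
        r = 1
        for p in primes:
            r *= p
        # Accept with probability r/N; rejection makes r exactly uniform.
        if r <= N and rng.randint(1, N) <= r:
            return r, primes
```

The appeal is that the factorization comes for free with the sample, with no factoring step; the paper's contribution is carrying this over to norms of ideals in number fields and function fields.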

Does Data Augmentation Lead to Positive Margin? [article]

Shashank Rajput, Zhili Feng, Zachary Charles, Po-Ling Loh, Dimitris Papailiopoulos
2019 arXiv   pre-print
Data augmentation (DA) is commonly used during model training, as it significantly improves test error and model robustness. DA artificially expands the training set by applying random noise, rotations, crops, or even adversarial perturbations to the input data. Although DA is widely used, its capacity to provably improve robustness is not fully understood. In this work, we analyze the robustness that DA begets by quantifying the margin that DA enforces on empirical risk minimizers. We first focus on linear separators, and then a class of nonlinear models whose labeling is constant within small convex hulls of data points. We present lower bounds on the number of augmented data points required for non-zero margin, and show that commonly used DA techniques may only introduce significant margin after adding exponentially many points to the data set.
arXiv:1905.03177v1 fatcat:n4adtd2u55d6lbkmvs5wewwrpy
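As a toy illustration of the margin question (an entirely invented 1-D setting with threshold classifiers, far simpler than the paper's models): augmenting each point with perturbed copies within radius r means any classifier that fits the augmented set must keep its decision boundary away from the original points.

```python
import random

def augment(points, radius, copies, rng):
    """Expand each labeled 1-D point (x, y) with uniformly perturbed clones
    in [x - radius, x + radius], keeping the label."""
    out = list(points)
    for x, y in points:
        for _ in range(copies):
            out.append((x + rng.uniform(-radius, radius), y))
    return out

def margin(threshold, points):
    """Distance from a threshold classifier to its nearest point, or 0.0 if it
    misclassifies anything (label = sign of x - threshold)."""
    if any((1 if x > threshold else -1) != y for x, y in points):
        return 0.0
    return min(abs(x - threshold) for x, _ in points)

rng = random.Random(0)
points = [(-1.0, -1), (1.0, 1)]
augmented = augment(points, radius=0.5, copies=50, rng=rng)
# The threshold at 0 still fits the augmented set, and fitting the augmented
# set confines any correct threshold to the gap left between the clouds.
```

The paper's lower bounds concern how many such augmented points are needed before a guaranteed margin appears; in this deterministic toy the radius bounds the margin directly.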

Stability and Generalization of Learning Algorithms that Converge to Global Optima [article]

Zachary Charles, Dimitris Papailiopoulos
2017 arXiv   pre-print
We establish novel generalization bounds for learning algorithms that converge to global minima. We do so by deriving black-box stability results that only depend on the convergence of a learning algorithm and the geometry around the minimizers of the loss function. The results are shown for nonconvex loss functions satisfying the Polyak-Łojasiewicz (PL) and the quadratic growth (QG) conditions. We further show that these conditions arise for some neural networks with linear activations. We use our black-box results to establish the stability of optimization algorithms such as stochastic gradient descent (SGD), gradient descent (GD), randomized coordinate descent (RCD), and the stochastic variance reduced gradient method (SVRG), in both the PL and the strongly convex setting. Our results match or improve state-of-the-art generalization bounds and can easily be extended to similar optimization algorithms. Finally, we show that although our results imply comparable stability for SGD and GD in the PL setting, there exist simple neural networks with multiple local minima where SGD is stable but GD is not.
arXiv:1710.08402v1 fatcat:qjd43lfrkverfhyuynm3bpa2zq
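For reference, the two geometric conditions named in the abstract can be stated compactly in their standard form (with f^* the minimum value of f, X^* the set of global minimizers, and μ > 0; this is the conventional statement, not notation copied from the paper):

```latex
\text{(PL)}\quad \frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\bigl(f(x) - f^*\bigr),
\qquad
\text{(QG)}\quad f(x) - f^* \;\ge\; \frac{\mu}{2}\,\mathrm{dist}(x, X^*)^2 .
```

PL says small gradient norm forces near-optimal function value even without convexity, which is the handle the black-box stability argument needs.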

On the Outsized Importance of Learning Rates in Local Update Methods [article]

Zachary Charles, Jakub Konečný
2020 arXiv   pre-print
Brendan McMahan and Zachary Garrett for fruitful discussions about decoupling client and server learning rates in federated learning.  ... 
arXiv:2007.00878v1 fatcat:bxzwo7zykngf7g6d536dfq4uau

Exploiting Algebraic Structure in Global Optimization and the Belgian Chocolate Problem [article]

Zachary Charles, Nigel Boston
2017 arXiv   pre-print
The Belgian chocolate problem involves maximizing a parameter δ over a non-convex region of polynomials. In this paper we detail a global optimization method for this problem that outperforms previous such methods by exploiting underlying algebraic structure. Previous work has focused on iterative methods that, due to the complicated non-convex feasible region, may require many iterations or result in non-optimal δ. By contrast, our method locates the largest known value of δ in a non-iterative manner. We do this by using the algebraic structure to go directly to large limiting values, reducing the problem to a simpler combinatorial optimization problem. While these limiting values are not necessarily feasible, we give an explicit algorithm for arbitrarily approximating them by feasible δ. Using this approach, we find the largest known value of δ to date, δ = 0.9808348. We also demonstrate that in low degree settings, our method recovers previously known upper bounds on δ and that prior methods converge towards the δ we find.
arXiv:1708.08114v1 fatcat:oh7ldtykqjdardep55vgij7lwq
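For context, the optimization the abstract refers to is usually stated as follows in the literature on this problem (a hedged restatement of the standard formulation, not text from the paper):

```latex
\max\ \delta \quad \text{subject to}\quad x(s),\; y(s)\ \text{stable, and}\quad
x(s)\,\bigl(s^2 - 2\delta s + 1\bigr) + y(s)\,\bigl(s^2 - 1\bigr)\ \text{stable},
```

where "stable" means all roots lie in the open left half-plane (Hurwitz). The feasible region is non-convex in the coefficients of x and y, which is why iterative methods struggle and why the algebraic approach described above is attractive.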
Showing results 1–15 of 12,225