27,283 Hits in 8.2 sec

99% of Worker-Master Communication in Distributed Optimization Is Not Needed

Konstantin Mishchenko, Filip Hanzely, Peter Richtárik
2020 Conference on Uncertainty in Artificial Intelligence  
In this paper we discuss sparsification of worker-to-server communication in large distributed systems.  ...  As an illustration, this means that when n = 100 parallel workers are used, the communication of 99% of the blocks is redundant, and hence a waste of time.  ...  In this case, the 'reduce' operation is not efficient and one needs to communicate data by sending the positions of nonzeros and their values.  ... 
dblp:conf/uai/MishchenkoHR20 fatcat:qcr5usi3ujbp7fnvrn2lqu5dni
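The snippet above notes that once updates are sparsified, the usual 'reduce' operation no longer applies and a worker instead sends the positions of nonzeros together with their values. A minimal sketch of that message format (function names are illustrative, not from the paper):

```python
def sparsify(update, k):
    """Worker side: keep the k largest-magnitude entries and return
    them as (positions, values) -- the sparse message described above."""
    top = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)[:k]
    top.sort()  # keep index order for the server
    return top, [update[i] for i in top]

def densify(positions, values, dim):
    """Server side: rebuild a dense vector from the sparse message."""
    out = [0.0] * dim
    for i, v in zip(positions, values):
        out[i] = v
    return out

g = [0.1, -3.0, 0.0, 2.5, -0.2]
pos, val = sparsify(g, 2)  # send 2 index-value pairs instead of 5 floats
```

With n workers each keeping roughly a 1/n fraction of coordinates, the communicated volume drops accordingly, which is the redundancy the abstract quantifies.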

A Survey of Coded Distributed Computing [article]

Jer Shyuan Ng, Wei Yang Bryan Lim, Nguyen Cong Luong, Zehui Xiong, Alia Asheralieva, Dusit Niyato, Cyril Leung, Chunyan Miao
2020 arXiv   pre-print
This results in a longer overall time needed to execute the computation tasks, thereby limiting the performance of distributed computing.  ...  In particular, computing nodes need to exchange intermediate results with each other in order to calculate the final result, and this significantly increases communication overheads.  ...  SECURE CODING FOR DISTRIBUTED COMPUTING In distributed computing, the data owner, master node, and workers may not belong to the same entity.  ... 
arXiv:2008.09048v1 fatcat:riy4dxvuc5ae3krz7lf25zkg6m

Performance Modeling and Analysis of a Massively Parallel Direct—Part 1

Jian He, Alex Verstak, L.T. Watson, M. Sosonkina
2009 The international journal of high performance computing applications  
The goal is to (1) ensure the design effectiveness of pDIRECT II on a variety of problems and systems, (2) guide the proper choice of optimization parameter inputs specified by users, and (3) describe  ...  Modeling and analysis techniques are used to investigate the performance of a massively parallel version of DIRECT, a global search algorithm widely used in multidisciplinary design optimization applications  ...  ACKNOWLEDGMENTS This work was supported in part by National Science Foundation Grant DMI-0355391, Department of Energy Grant DE-FG02-06ER25720, and NIGMS/NIH Grant 1 R01 GM078989-01.  ... 
doi:10.1177/1094342008098462 fatcat:g7ajvbafmnalhg6ixjxorbkumy

Performance Modeling and Analysis of a Massively Parallel Direct—Part 2

Jian He, Alex Verstak, M. Sosonkina, L.T. Watson
2009 The international journal of high performance computing applications  
The goal is to (1) ensure the design effectiveness of pDIRECT II on a variety of problems and systems, (2) guide the proper choice of optimization parameter inputs specified by users, and (3) describe  ...  Modeling and analysis techniques are used to investigate the performance of a massively parallel version of DIRECT, a global search algorithm widely used in multidisciplinary design optimization applications  ...  ACKNOWLEDGMENTS This work was supported in part by National Science Foundation Grant DMI-0355391, Department of Energy Grant DE-FG02-06ER25720, and NIGMS/NIH Grant 1 R01 GM078989-01.  ... 
doi:10.1177/1094342008098463 fatcat:flezzx2onzfgzaxj56477ffg74

Distributed Linearly Separable Computation [article]

Kai Wan and Hua Sun and Mingyue Ji and Giuseppe Caire
2021 arXiv   pre-print
Our objective is to find the optimal tradeoff between the computation cost (number of uncoded datasets assigned to each worker) and the communication cost (number of symbols the master must download),  ...  In this paper, we consider the specific case where the computation cost is minimum, and propose novel achievability schemes and converse bounds for the optimal communication cost.  ...  with dimension 2 × 2 which is not full-rank, the optimal communication cost is 3; • otherwise, the optimal communication cost is 2.  ... 
arXiv:2007.00345v2 fatcat:wt5nwdefy5g3hm35uhktbwxf24

Private Coded Computation for Machine Learning [article]

Minchul Kim, Heecheol Yang, Jungwoo Lee
2018 arXiv   pre-print
In private coded computation, the master needs to compute a function of its own dataset and one of the datasets in a library exclusively shared by the external workers.  ...  In a distributed computing system for the master-worker framework, an erasure code can mitigate the effects of slow workers, also called stragglers.  ...  The master needs distributed computing on a function f of A and one of M datasets {B_k}_{k=1}^{M} in library B, where f : (V_1, V_2) → V_3 for a vector space V_3 over the same field F.  ... 
arXiv:1807.01170v3 fatcat:em3sqabbirarnbl43yy6feu7be
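The straggler-mitigation idea in this snippet can be illustrated with the simplest coded scheme (a hypothetical (3, 2) example, not the paper's construction): to compute A @ x, the master splits A into A1 and A2 and sends three workers the coded blocks A1, A2, and A1 + A2; any two replies recover the result, so one slow worker can be ignored.

```python
def matvec(M, x):
    """Plain matrix-vector product over lists of lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def madd(M, N):
    """Elementwise sum of two matrices (the coded third task)."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(M, N)]

A1 = [[1.0, 2.0], [3.0, 4.0]]
A2 = [[5.0, 6.0], [7.0, 8.0]]
x = [1.0, 1.0]

# Suppose worker 2 (holding A2) straggles; only workers 1 and 3 reply.
r1 = matvec(A1, x)            # worker 1: A1 @ x
r3 = matvec(madd(A1, A2), x)  # worker 3: (A1 + A2) @ x

a2x = [b - a for a, b in zip(r1, r3)]  # decode the missing A2 @ x
Ax = r1 + a2x                          # full result, straggler ignored
```

The private variant surveyed here layers a privacy requirement on top of this: the workers must not learn which library dataset B_k the master is interested in.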

Distributed model validation with Epsilon

Sina Madani, Dimitris Kolovos, Richard F. Paige
2021 Journal of Software and Systems Modeling  
In this paper, we demonstrate a low-overhead data-parallel approach for distributed model validation in the context of an OCL-like language.  ...  Our approach minimises communication costs by exploiting the deterministic structure of programs and can take advantage of multiple cores on each (heterogeneous) machine with highly configurable computational  ...  Acknowledgements The work in this paper was supported by the European Commission via the CROSSMINER H2020 Project (Grant #732223).  ... 
doi:10.1007/s10270-021-00878-x fatcat:zsseexwx2vdt7cxu36jqncvyiy

Natural Compression for Distributed Deep Learning [article]

Samuel Horvath, Chen-Yu Ho, Ludovit Horvath, Atal Narayan Sahu, Marco Canini, Peter Richtarik
2020 arXiv   pre-print
training algorithms, such as distributed SGD, is negligible.  ...  Modern deep learning models are often trained in parallel over a collection of distributed machines to reduce training time.  ...  MODEL 2 For the second model, we assume that the master communicates much faster than workers thus communication from workers is the bottleneck and we don't need to compress updates after aggregation,  ... 
arXiv:1905.10988v2 fatcat:cwk3l74cfjag5nt6v6nqlleuci
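Natural compression, as named in the title, rounds each coordinate to a nearby power of two so it can be transmitted compactly. A minimal sketch of unbiased stochastic power-of-two rounding in that spirit (my own sketch under that assumption, not the paper's API):

```python
import math
import random

def c_nat(x, rng=random):
    """Stochastically round |x| to a neighbouring power of two.
    Rounding up with probability proportional to the distance from the
    lower bracket makes the output an unbiased estimate of x."""
    if x == 0.0:
        return 0.0
    s = math.copysign(1.0, x)
    a = math.floor(math.log2(abs(x)))
    lo, hi = 2.0 ** a, 2.0 ** (a + 1)
    p_up = (abs(x) - lo) / (hi - lo)  # probability of rounding up
    return s * (hi if rng.random() < p_up else lo)
```

Since E[c_nat(x)] = lo + (x - lo) = x, the compressed update stays unbiased, which is what lets plain distributed SGD tolerate it with negligible effect on convergence.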

Scalability Analysis of the Asynchronous, Master-Slave Borg Multiobjective Evolutionary Algorithm

David Hadka, Kamesh Madduri, Patrick Reed
2013 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum  
The Borg Multiobjective Evolutionary Algorithm (MOEA) is a new, efficient, and robust optimizer that outperforms competing optimization methods on numerous complex engineering problems.  ...  Problems from these domains often involve expensive design evaluations that require large-scale parallel algorithms to produce results in a reasonable amount of time.  ...  ACKNOWLEDGMENT This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number OCI-1053575.  ... 
doi:10.1109/ipdpsw.2013.160 dblp:conf/ipps/HadkaMR13 fatcat:fplqf6rd5bcpdmdl3v3lp32ez4

DataMill

Augusto Born de Oliveira, Jean-Christophe Petkovich, Thomas Reidemeister, Sebastian Fischmeister
2013 Proceedings of the ACM/SPEC international conference on International conference on performance engineering - ICPE '13  
Empirical systems research is facing a dilemma.  ...  So how can one trust any reported empirical analysis of a new idea or concept in computer science?  ...  ACKNOWLEDGMENTS This research was supported in part by NSERC DG 357121-2008, ORF-RE03-045, ORF-RE04-036, ORF-RE04-039, APCPJ 386797-09, CFI 20314 and CMC, ISOP IS09-06-037, and the industrial partners  ... 
doi:10.1145/2479871.2479892 dblp:conf/wosp/OliveiraPRF13 fatcat:jtwst63t2zaxlhdukwptphwwha

Clustered Workflow Execution of Retargeted Data Analysis Scripts

Daniel L. Wang, Charles S. Zender, Stephen F. Jenks
2008 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID)  
We show how dataflow and other compiler-inspired analyses of shell scripts of scientists' most common analysis tools enables parallelization and optimizations in disk and network I/O bandwidth.  ...  We benchmark using an actual geoscience analysis script, illustrating the crucial performance gains of extracting workflows defined in scripts and optimizing their execution.  ...  This will reduce the communications between master and worker and should reduce file exchange between workers.  ... 
doi:10.1109/ccgrid.2008.69 dblp:conf/ccgrid/WangZJ08 fatcat:xfnqicnaevfmxapgi36h4j2hpe

Communication Optimization Strategies for Distributed Deep Learning: A Survey [article]

Shuo Ouyang, Dezun Dong, Yemao Xu, Liquan Xiao
2020 arXiv   pre-print
Algorithm optimizations focus on reducing the amount of communication in distributed training, while network optimizations focus on speeding up the communication between distributed devices.  ...  To mitigate the drawbacks of distributed communication, researchers have proposed various optimization strategies.  ...  The parameter server is usually run on top of ZeroMQ [98] or gRPC [99] .  ... 
arXiv:2003.03009v1 fatcat:i2gwql7g5ve7rahug6ggd4p6kq

Efficient Master/Worker Parallel Discrete Event Simulation

Alfred Park, Ric Fujimoto
2009 2009 ACM/IEEE/SCS 23rd Workshop on Principles of Advanced and Distributed Simulation  
It has truly been an honor and a blessing to perform research under the supervision of one of the pioneers in the parallel and distributed simulation field.  ...  I feel I have learned a great deal not only from his advisement and expertise but also through the exposure to a variety of projects that I was able to participate in.  ...  In master/worker systems, this is not possible due to the strict restriction that clients may not communicate with each other.  ... 
doi:10.1109/pads.2009.9 dblp:conf/pads/ParkF09 fatcat:6aiga6uc6jdy5adfzhujiqab3a

Abusing cloud-based browsers for fun and profit

Vasant Tendulkar, Ryan Snyder, Joe Pletcher, Kevin Butler, Ashwin Shashidharan, William Enck
2012 Proceedings of the 28th Annual Computer Security Applications Conference on - ACSAC '12  
In response to the surge of smartphones and mobile devices, several cloud-based Web browsers have become commercially available.  ...  We implement and test three canonical MapReduce applications (word count, distributed grep, and distributed sort).  ...  Acknowledgements This work is supported in part by the National Science Foundation under awards CNS-1118046 and CNS-1222680, as well as in part by the U.S.  ... 
doi:10.1145/2420950.2420984 dblp:conf/acsac/TendulkarSPBSE12 fatcat:6yf3bwtatvddfnfhrlxgmijb3u
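Of the three canonical MapReduce applications this paper tests, word count is the simplest to sketch. A self-contained illustration of its map and reduce phases (illustrative only, not the paper's implementation):

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: each worker emits a (word, 1) pair for every word in its chunk."""
    return [(w, 1) for w in chunk.split()]

def reduce_phase(pairs):
    """Reduce: sum the emitted counts for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

chunks = ["to be or", "not to be"]  # two workers, one text chunk each
pairs = [kv for c in chunks for kv in map_phase(c)]
result = reduce_phase(pairs)
```

Distributed grep and distributed sort follow the same pattern, differing only in what the map phase emits and how the reduce phase combines it.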

Perfect Load Balancing for Demand- Driven Parallel Ray Tracing [chapter]

Tomas Plachetka
2002 Lecture Notes in Computer Science  
A distributed object database allows rendering of complex scenes which cannot be stored in the memory of a single processor.  ...  Correctness and optimality of a perfect load balancing algorithm for image space subdivision are proved and its exact message complexity is given.  ...  The initial distribution of objects on processors is quasi-random. The master process is the first process which parses the scene.  ... 
doi:10.1007/3-540-45706-2_56 fatcat:jq4jjzceuzb6dcuoq4ly6f2twm
Showing results 1–15 out of 27,283 results