Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL? [article]

Ammar Ahmad Awan, Ching-Hsiang Chu, Hari Subramoni, Dhabaleswar K. Panda
2017-07-28, arXiv, pre-print
However, with the advent of MPI+CUDA applications and CUDA-Aware MPI runtimes like MVAPICH2 and OpenMPI, it has become important to address efficient communication schemes for such dense Multi-GPU nodes  ...  This coupled with new application workloads brought forward by Deep Learning frameworks like Caffe and Microsoft CNTK pose additional design constraints due to very large message communication of GPU buffers  ...  ACKNOWLEDGMENT This research is supported in part by National Science Foundation grants #CCF-1565414 and #CNS-1513120.  ...
arXiv:1707.09414v1 (https://arxiv.org/abs/1707.09414v1); fatcat:lqh3x46v7jcqvkxjdasjxxxqda (https://fatcat.wiki/release/lqh3x46v7jcqvkxjdasjxxxqda)
Fulltext PDF (Web Archive): https://web.archive.org/web/20200910053942/https://arxiv.org/pdf/1707.09414v1.pdf

An Empirical Evaluation of Allgatherv on Multi-GPU Systems

Thomas B. Rolinger, Tyler A. Simon, Christopher D. Krieger
2018, 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), IEEE
We then evaluate the communication performance of our tool when using traditional MPI, CUDA-aware MVAPICH and NCCL across a suite of real-world data sets on three different systems: a 16-node cluster with  ...  Applications for deep learning and big data analytics have compute and memory requirements that exceed the limits of a single GPU.  ...  [21] have compared the performance of the broadcast collective in NCCL and an extended version of MVAPICH-GDR for deep learning workloads.  ...
doi:10.1109/ccgrid.2018.00027 (https://doi.org/10.1109/ccgrid.2018.00027); dblp:conf/ccgrid/RolingerSK18 (https://dblp.org/rec/conf/ccgrid/RolingerSK18.html); fatcat:iobuxtw6vbehdiikbllavu43sm (https://fatcat.wiki/release/iobuxtw6vbehdiikbllavu43sm)
Fulltext PDF (Web Archive, not primary version): https://web.archive.org/web/20200831164931/https://arxiv.org/pdf/1812.05964v1.pdf

Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation [article]

Ammar Ahmad Awan, Jeroen Bedorf, Ching-Hsiang Chu, Hari Subramoni, and Dhabaleswar K. Panda
2018-10-25, arXiv, pre-print
Finally, we propose a truly CUDA-Aware MPI Allreduce design that exploits CUDA kernels and pointer caching to perform large reductions efficiently.  ...  Our proposed designs offer 5-17X better performance than NCCL2 for small and medium messages, and reduces latency by 29% for large messages.  ...  The authors would like to thank Jonathan Perkins and Dr. Khaled Hamidouche for extending invaluable help in implementing the pointer cache design for MVAPICH2-GDR.  ...
arXiv:1810.11112v1 (https://arxiv.org/abs/1810.11112v1); fatcat:2zztyhflwzfcvnaech35azvppe (https://fatcat.wiki/release/2zztyhflwzfcvnaech35azvppe)
Fulltext PDF (Web Archive): https://web.archive.org/web/20191013012902/https://arxiv.org/pdf/1810.11112v1.pdf

Monitoring Collective Communication Among GPUs [article]

Muhammet Abdullah Soyturk, Palwisha Akhtar, Erhan Tezcan, Didem Unat
2021-10-20, arXiv, pre-print
In this work, we extend ComScribe to identify communication among GPUs for collective and P2P communication primitives in NVIDIA's NCCL library.  ...  In addition to P2P communications, collective communications are commonly used in HPC and AI workloads thus it is important to monitor the induced data movement due to collectives.  ...  Acknowledgement The work is supported by the Scientific and Technological Research Council of Turkey (TUBITAK), Grant no. 120E492. Dr.  ...
arXiv:2110.10401v1 (https://arxiv.org/abs/2110.10401v1); fatcat:yz4wl3hlarfszensiokgn7mvze (https://fatcat.wiki/release/yz4wl3hlarfszensiokgn7mvze)
Fulltext PDF (Web Archive): https://web.archive.org/web/20211025182833/https://arxiv.org/pdf/2110.10401v1.pdf

ADAPT

Xi Luo, Wei Wu, George Bosilca, Thananon Patinyasakdikul, Linnan Wang, Jack Dongarra
2018, Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing - HPDC '18, ACM Press
In particular, we demonstrate at least 1.3× and 1.5× speedup for CPU data and 2× and 10× speedup for GPU data using ADAPT event-based broadcast and reduce operations.  ...  Therefore, such design philosophy must be reconsidered to efficiently and robustly run on the large-scale heterogeneous platforms.  ...  For reduce operations on large messages, ADAPT performs better than most topology-aware algorithms in Intel MPI, except Shumilin's.  ...
doi:10.1145/3208040.3208054 (https://doi.org/10.1145/3208040.3208054); dblp:conf/hpdc/LuoWBPWD18 (https://dblp.org/rec/conf/hpdc/LuoWBPWD18.html); fatcat:cfebghog25cktm6qj2gzjkllru (https://fatcat.wiki/release/cfebghog25cktm6qj2gzjkllru)
Fulltext PDF (Web Archive): https://web.archive.org/web/20180723121225/http://www.icl.utk.edu/files/publications/2018/icl-utk-1052-2018.pdf

Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect [article]

Ang Li and Shuaiwen Leon Song and Jieyang Chen and Jiajia Li and Xu Liu and Nathan Tallent and Kevin Barker
2019-03-11, arXiv, pre-print
High performance multi-GPU computing becomes an inevitable trend due to the ever-increasing demand on computation capability in emerging domains such as deep learning, big data and planet-scale simulations  ...  These observations indicate that, for an application running in a multi-GPU node, choosing the right GPU combination can impose considerable impact on GPU communication efficiency, as well as the application's  ...  [60] proposed a pipelined chain design for MPI broadcast collective operations on multi-GPU nodes to facilitate various deep learning frameworks.  ...
arXiv:1903.04611v1 (https://arxiv.org/abs/1903.04611v1); fatcat:n22jkoiu35e7dk5thtc4ot7liu (https://fatcat.wiki/release/n22jkoiu35e7dk5thtc4ot7liu)
Fulltext PDF (Web Archive): https://web.archive.org/web/20191026043607/https://arxiv.org/pdf/1903.04611v1.pdf

Exascale Deep Learning for Climate Analytics [article]

Thorsten Kurth, Sean Treichler, Joshua Romero, Mayur Mudigonda, Nathan Luehr, Everett Phillips, Ankur Mahesh, Michael Matheson, Jack Deslippe, Massimiliano Fatica, Prabhat, Michael Houston
2018-10-03, arXiv, pre-print
We describe improvements to the software frameworks, input pipeline, and the network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems.  ...  We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks.  ...  More recently, work by Krizhevsky [24] opened the flood gates for modern day Deep Learning, showing impressive performance on hard vision tasks using large supervised deep networks.  ...
arXiv:1810.01993v1 (https://arxiv.org/abs/1810.01993v1); fatcat:q7vmhkuxejaunoktkfm73tlpra (https://fatcat.wiki/release/q7vmhkuxejaunoktkfm73tlpra)
Fulltext PDF (Web Archive): https://web.archive.org/web/20200827181024/https://arxiv.org/pdf/1810.01993v1.pdf

Exascale Deep Learning for Climate Analytics

Thorsten Kurth, Sean Treichler, Joshua Romero, Mayur Mudigonda, Nathan Luehr, Everett Phillips, Ankur Mahesh, Michael Matheson, Jack Deslippe, Massimiliano Fatica, Prabhat Prabhat, Michael Houston
2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis, IEEE
We describe improvements to the software frameworks, input pipeline, and the network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems.  ...  We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks.  ...  More recently, work by Krizhevsky [24] opened the flood gates for modern day Deep Learning, showing impressive performance on hard vision tasks using large supervised deep networks.  ...
doi:10.1109/sc.2018.00054 (https://doi.org/10.1109/sc.2018.00054); fatcat:3gin7blvnzgezcnyh2wm5r4zae (https://fatcat.wiki/release/3gin7blvnzgezcnyh2wm5r4zae)
Fulltext PDF (Web Archive): https://web.archive.org/web/20200313091700/https://escholarship.org/content/qt3wc2j1nx/qt3wc2j1nx.pdf?t=pjks42

HPTMT Parallel Operators for High Performance Data Science Data Engineering [article]

Vibhatha Abeykoon, Supun Kamburugamuve, Chathura Widanage, Niranda Perera, Ahmet Uyar, Thejaka Amila Kanewala, Gregor von Laszewski, Geoffrey Fox
2021-08-13, arXiv, pre-print
This paper elaborates and illustrates this architecture using an end-to-end application with deep learning and data engineering parts working together.  ...  They are comprised of a rich set of sub-domains such as data engineering, deep learning, and machine learning.  ...  We thank the FutureSystems team for their infrastructure support.  ...
arXiv:2108.06001v1 (https://arxiv.org/abs/2108.06001v1); fatcat:qbnz7lk4mffc5mccq3xzntxkym (https://fatcat.wiki/release/qbnz7lk4mffc5mccq3xzntxkym)
Fulltext PDF (Web Archive): https://web.archive.org/web/20210826124634/https://arxiv.org/pdf/2108.06001v1.pdf

Optimizing Network Performance for Distributed DNN Training on GPU Clusters: ImageNet/AlexNet Training in 1.5 Minutes [article]

Peng Sun, Wansen Feng, Ruobing Han, Shengen Yan, Yonggang Wen
2019-10-22, arXiv, pre-print
It is important to scale out deep neural network (DNN) training for reducing model training time.  ...  To address this problem, we propose a communication backend named GradientFlow for distributed DNN training, and employ a set of network optimization techniques.  ...  ACKNOWLEDGEMENT We gratefully acknowledge contributions from our colleagues within SenseTime Research and our collaborators from Cloud Application and Platform Lab, Nanyang Technological University Singapore  ...
arXiv:1902.06855v3 (https://arxiv.org/abs/1902.06855v3); fatcat:6dg4mczgfvcprm6cq533f3c7ca (https://fatcat.wiki/release/6dg4mczgfvcprm6cq533f3c7ca)
Fulltext PDF (Web Archive): https://web.archive.org/web/20200912134918/https://arxiv.org/pdf/1902.06855v3.pdf

A Novel Co-design Peta-scale Heterogeneous Cluster for Deep Learning Training [article]

Xin Chen and Hua Zhou and Yuxiang Gao and Yu Zhu
2018-05-18, arXiv, pre-print
It is key for researchers to own a great powerful computing platform to leverage deep learning (DL) advancing. On the other hand, as the commonly-used accelerator, the commodity GPUs cards of new generations  ...  Large scale deep Convolution Neural Networks (CNNs) increasingly demands the computing power.  ...  Many thanks to Super Micro Computer, Inc. for their help with building Manoa. References  ...
arXiv:1802.02326v3 (https://arxiv.org/abs/1802.02326v3); fatcat:e3lwpsdzaza27ayyhizp4clxpy (https://fatcat.wiki/release/e3lwpsdzaza27ayyhizp4clxpy)
Fulltext PDF (Web Archive): https://web.archive.org/web/20200901031700/https://arxiv.org/pdf/1802.02326v3.pdf

Distributed Deep Learning Strategies For Automatic Speech Recognition [article]

Wei Zhang, Xiaodong Cui, Ulrich Finkler, Brian Kingsbury, George Saon, David Kung, Michael Picheny
2019-04-10, arXiv, pre-print
In this paper, we propose and investigate a variety of distributed deep learning strategies for automatic speech recognition (ASR) and evaluate them with a state-of-the-art Long short-term memory (LSTM  ...  We first investigate what are the proper hyper-parameters (e.g., learning rate) to enable the training with sufficiently large batch size without impairing the model accuracy.  ...  We use the CUDA 9.2 compiler, the CUDA-aware OpenMPI 3.1.1, and g++ 4.8.5 compiler to build our communication library, which connects with PyTorch via a Python-C interface.  ...
arXiv:1904.04956v1 (https://arxiv.org/abs/1904.04956v1); fatcat:fae4eb63e5ajdkcqcro2ren7uy (https://fatcat.wiki/release/fae4eb63e5ajdkcqcro2ren7uy)
Fulltext PDF (Web Archive): https://web.archive.org/web/20191018192518/https://arxiv.org/pdf/1904.04956v1.pdf

Machine Learning in Python: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence

Sebastian Raschka, Joshua Patterson, Corey Nolet
2020-04-04, Information, MDPI AG
Python continues to be the most preferred language for scientific computing, data science, and machine learning, boosting both performance and productivity by enabling the use of low-level libraries and  ...  We cover widely-used libraries and concepts, collected together for holistic comparison, with the goal of educating the reader and driving the field of Python machine learning forward.  ...  Even with CUDA-aware MPI, however, collective communication operations such as reductions and broadcasts, which allow a set of ranks to collectively participate in a data operation, are performed on the  ...
doi:10.3390/info11040193 (https://doi.org/10.3390/info11040193); fatcat:hetp7ngcpbbcpkhdcyowuiiwxe (https://fatcat.wiki/release/hetp7ngcpbbcpkhdcyowuiiwxe)
Fulltext PDF (Web Archive): https://web.archive.org/web/20210208174826/https://res.mdpi.com/d_attachment/information/information-11-00193/article_deploy/information-11-00193-v2.pdf

Spark-MPI: Approaching the Fifth Paradigm of Cognitive Applications [article]

Nikolay Malitsky, Ralph Castain, Matt Cowan
2018-05-16, arXiv, pre-print
The approach is demonstrated within the context of hybrid MPI/GPU ptychographic image reconstruction pipelines and distributed deep learning applications.  ...  Over the past decade, the fourth paradigm of data-intensive science rapidly became a major driving concept of multiple application domains encompassing and generating large-scale devices such as light  ...  Deep learning applications advanced the scope and requirements of large-scale scientific projects to the next level.  ...
arXiv:1806.01110v1 (https://arxiv.org/abs/1806.01110v1); fatcat:x6mmqowmkje7heitxvhaf5fsui (https://fatcat.wiki/release/x6mmqowmkje7heitxvhaf5fsui)
Fulltext PDF (Web Archive): https://web.archive.org/web/20200826044206/https://arxiv.org/pdf/1806.01110v1.pdf

Communication Optimization Strategies for Distributed Deep Learning: A Survey [article]

Shuo Ouyang, Dezun Dong, Yemao Xu, Liquan Xiao
2020-03-06, arXiv, pre-print
Recent trends in high-performance computing and deep learning lead to a proliferation of studies on large-scale deep neural network (DNN) training.  ...  Finally, we extrapolate potential challenges and research directions for communication acceleration in distributed DNN training.  ...  A series of studies by Awan et al. [107, 108] optimized Bcast operation based on NCCL and CUDA-Aware MPI, respectively.  ...
arXiv:2003.03009v1 (https://arxiv.org/abs/2003.03009v1); fatcat:i2gwql7g5ve7rahug6ggd4p6kq (https://fatcat.wiki/release/i2gwql7g5ve7rahug6ggd4p6kq)
Fulltext PDF (Web Archive): https://web.archive.org/web/20200320164946/https://arxiv.org/pdf/2003.03009v1.pdf