37 Hits in 4.6 sec

iWARP redefined: Scalable connectionless communication over high-speed Ethernet

Mohammad J. Rashti, Ryan E. Grant, Ahmad Afsahi, Pavan Balaji
2010 International Conference on High Performance Computing (HiPC)
Our microbenchmark and MPI application results show performance and memory usage benefits for MPI applications, promoting the use of datagram-iWARP for large-scale HPC applications.  ...  iWARP represents the leading edge of high-performance Ethernet technologies.  ...  Specifically, communication libraries such as MPI pre-allocate memory buffers per connection to be used for fast buffering and communication management [14]. 2) Performance.  ... 
doi:10.1109/hipc.2010.5713192 dblp:conf/hipc/RashtiGAB10 fatcat:3ioiy6tcbbcyhfi36jdpmqbwfq

Spark-MPI: Approaching the Fifth Paradigm of Cognitive Applications [article]

Nikolay Malitsky, Ralph Castain, Matt Cowan
2018 arXiv   pre-print
The success of data-intensive projects subsequently triggered the next generation of machine learning approaches.  ...  Data processing platforms and HPC technologies.  ...  SHARP is a high-performance distributed ptychographic solver using GPU kernels and the MPI protocol.  ... 
arXiv:1806.01110v1 fatcat:x6mmqowmkje7heitxvhaf5fsui

Building Near-Real-Time Processing Pipelines with the Spark-MPI Platform [article]

Nikolay Malitsky, Aashish Chaudhary, Sebastien Jourdain, Matt Cowan, Patrick O'Leary, Marcus Hanwell, Kerstin Kleese Van Dam
2018 arXiv   pre-print
Recently, this demand was addressed by the Spark-MPI approach connecting the Spark data-intensive platform with the MPI high-performance framework.  ...  Advances in detectors and computational technologies provide new opportunities for applied research and the fundamental sciences.  ...  As the first step on this path, the paper presents the Spark-MPI integrated platform connecting the Spark and MPI technologies for building data-intensive high-performance data processing pipelines for  ... 
arXiv:1805.04886v1 fatcat:3tdybjzdt5bordjpvwjvqpj5fa

mpi4py: Status Update After 12 Years of Development

Lisandro Dalcin, Yao-Lung Leo Fang
2021 Computing in science & engineering (Print)  
MPI for Python (mpi4py) has evolved to become the most used Python binding for the Message Passing Interface (MPI).  ...  We report on various improvements and features that mpi4py gradually accumulated over the past decade, including support up to the MPI-3.1 specification, support for CUDA-aware MPI implementations, and  ...  We thank Graham Markall for leading the community effort to revise and improve the CAI protocol.  ... 
doi:10.1109/mcse.2021.3083216 fatcat:i6tmfbxvfzbana25viweaw3gh4

Automatic MPI application transformation with ASPhALT

Anthony Danalis, Lori Pollock, Martin Swany
2007 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
In this paper we present the asphalt transformer, the Open64-based component of our framework, ASPhALT, responsible for automatically performing the prepushing transformation.  ...  This paper describes a source-to-source compilation tool for optimizing MPI-based parallel applications.  ...  Acknowledgments We would like to thank the University of Tennessee and Professor Dionisios G. Vlachos at the University of Delaware for providing us with access to their clusters.  ... 
doi:10.1109/ipdps.2007.370486 dblp:conf/ipps/DanalisPS07 fatcat:ll36g25sbravjhf3x2qxhpqr7q

D7.2.1 A Report on the Survey of HPC Tools and Techniques

Michael Lysaght, Bjorn Lindi, Vit Vondrak, John Donners, Marc Tajchman
2013 Zenodo  
The objective of PRACE-3IP Work Package 7 (WP7) 'Application Enabling and Support' is to provide applications-enabling support for HPC application codes which are important for European researchers to  ...  This deliverable contains a comprehensive survey of the research activity undertaken within PRACE to date so as to better understand what HPC tools and techniques have been developed that could be successfully  ...  Terascale Resources (British Supercomputer); HOPSA Holistic Performance System Analysis Project; HPC High Performance Computing; HPCS High Productivity Computing Systems; HPCT High Performance Computing Toolkit  ... 
doi:10.5281/zenodo.6575492 fatcat:grwigpxd7naifbzo6w67w4glrm

Hardware Developments II

Liang Liang, Jony Castagna, Alan O'Cais, Simon Wong, Goar Sanchez
2017 Zenodo  
detailed feedback to the project software developers; - discussion of project software needs with hardware and software vendors, completion of a survey of what is already available for particular hardware  ...  platforms; and, - detailed output from direct face-to-face sessions between the project end-users, developers and hardware vendors.  ...  platform MPI and MVAPICH).  ... 
doi:10.5281/zenodo.1207612 fatcat:p75hwqe5jjantcugbqrov7ryla

Interpreting Performance Data across Intuitive Domains

Martin Schulz, Joshua A. Levine, Peer-Timo Bremer, Todd Gamblin, Valerio Pascucci
2011 International Conference on Parallel Processing (ICPP)
We show that taking data from each of these domains and projecting, visualizing, and correlating it to the other domains can give valuable insights into the behavior of parallel application codes.  ...  and in its communication behavior, and by doing so leads to an improved understanding of the performance of their codes.  ...  The four cores share a 2 MB L3 cache. Each node runs CHAOS 4, a high-performance Linux variant based on Red Hat Enterprise Linux. We rely on MVAPICH as our MPI implementation.  ... 
doi:10.1109/icpp.2011.60 dblp:conf/icpp/SchulzLBGP11 fatcat:n6un6wvbzngx3mhjanx4dru34e

Formosa3: A Cloud-Enabled HPC Cluster in NCHC

Chin-Hung Li, Te-Ming Chen, Ying-Chuan Chen, Shuen-Tai Wang
2011 Zenodo  
We present initial work on the innovative integration of an HPC batch system and virtualization tools, aiming at coexistence such that they meet the minimal-interference requirement of a traditional HPC cluster.  ...  In this cluster, the InfiniBand network is responsible for both MPI and parallel file-system communication.  ... 
doi:10.5281/zenodo.1329114 fatcat:4svbh7gqp5du5bnncbc5eybcdy

Native Mode-Based Optimizations of Remote Memory Accesses in OpenSHMEM for Intel Xeon Phi

Naveen Namashivayam, Sayan Ghosh, Dounia Khaldi, Deepak Eachempati, Barbara Chapman
2014 Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models - PGAS '14  
OpenSHMEM is a PGAS library that aims to deliver high performance while retaining portability.  ...  We show the benefits of this approach on the PGAS-Microbenchmarks we specifically developed for this research.  ...  The authors acknowledge the Texas Advanced Computing Center (TACC) at the University of  ... 
doi:10.1145/2676870.2676881 dblp:conf/pgas/NamashivayamGKEC14 fatcat:nrrn5wksxfgqphco47ywlzjs44

Experiences Using Hybrid MPI/OpenMP in the Real World: Parallelization of a 3D CFD Solver for Multi-Core Node Clusters

Gabriele Jost, Bob Robins
2010 Scientific Programming  
Today most systems in high-performance computing (HPC) feature a hierarchical hardware design: shared-memory nodes with several multi-core CPUs are connected via a network infrastructure.  ...  We discuss performance, scalability and limitations of the pure MPI version of the code on a variety of hardware platforms and show how the hybrid approach can help to overcome certain limitations.  ...  Support was also provided by the DoD High Performance Computing Modernization Program (HPCMP), User Productivity Enhancement, Technology Transfer and Training (PETTT) program.  ... 
doi:10.1155/2010/523898 fatcat:mjxbtvxubndnnki42rezjon2eu

Toward Exascale Resilience

Franck Cappello, Al Geist, Bill Gropp, Laxmikant Kale, Bill Kramer, Marc Snir
2009 The international journal of high performance computing applications  
This set of projections leaves the HPC fault-tolerance community with a difficult challenge: finding new approaches, possibly radically disruptive, to run applications until their normal  ...  Over the past few years resilience has become a major issue for HPC systems, in particular in the perspective of large petascale systems and future exascale ones.  ...  The research methodology is not well established and shared across the HPC community.  ... 
doi:10.1177/1094342009347767 fatcat:s7i4a7aocnckzka4bxsyzbg6qi

Cosmological neutrino simulations at extreme scale

J. D. Emberson, Hao-Ran Yu, Derek Inman, Tong-Jie Zhang, Ue-Li Pen, Joachim Harnois-Déraps, Shuo Yuan, Huan-Yu Teng, Hong-Ming Zhu, Xuelei Chen, Zhi-Zhong Xing
2017 Research in Astronomy and Astrophysics  
We highlight code optimizations made to exploit modern high performance computing architectures and present a novel method of data compression that reduces the phase-space particle footprint from 24 bytes  ...  We incorporate neutrinos into the cosmological N-body code CUBEP3M and discuss the challenges associated with pushing to the extreme scales demanded by the neutrino problem.  ...  JDE, DI, and ULP gratefully acknowledge the support of the National Science and Engineering Research Council  ... 
doi:10.1088/1674-4527/17/8/85 fatcat:xk6pfcvhqfh6dphxjryxlfngye

D9.2.2: Final Software Evaluation Report

Jose Carlos, Guillaume Colin de Verdière, Matthieu Hautreux, Giannis Koutsou
2012 Zenodo  
The characteristics of these prototypes were selected in order to allow investigation into a number of key aspects relevant to high performance computing, namely interconnects, I/O, energy efficiency and  ...  This deliverable reports on the latest software developments in high performance computing, as identified by the PRACE-1IP WP9 members.  ...  Recommendations on HPC tools: this section identifies research on tools for High Performance Computing (HPC).  ... 
doi:10.5281/zenodo.6553027 fatcat:6vbrtqizm5eutmmskf44eltoqq

Automatic translation of MPI source into a latency-tolerant, data-driven form

Tan Nguyen, Pietro Cicotti, Eric Bylaska, Dan Quinlan, Scott Baden
2017 Journal of Parallel and Distributed Computing  
Highlights: • Bamboo is a translator that can reformulate MPI source into a task graph form. • Bamboo supports both point-to-point and collective communication. • Bamboo supports GPUs, hiding communication  ...  The core message passing layer transforms a minimal subset of MPI point-to-point primitives, whereas the utility layer implements high-level routines by breaking them into their point-to-point components  ...  Bamboo transforms applications written in a subset of MPI into a data-driven form that overlaps communication with computation automatically.  ... 
doi:10.1016/j.jpdc.2017.02.009 fatcat:ddyuex5tkjewfp46zeao4njtpa
Showing results 1 — 15 out of 37 results