Scalable Critical-Path Based Performance Analysis

David Böhme, Felix Wolf, Bronis R. de Supinski, Martin Schulz, Markus Geimer
2012 2012 IEEE 26th International Parallel and Distributed Processing Symposium  
, such as identifying load imbalance, quantifying the impact of imbalance on runtime, and characterizing resource consumption.  ...  By replaying event traces in parallel, we can calculate these performance indicators in a highly scalable way, making them a suitable analysis instrument for massively parallel programs with thousands  ...  do, making it a highly valuable tool to assess load balance and to detect parallelization bottlenecks.  ... 
doi:10.1109/ipdps.2012.120 dblp:conf/ipps/BohmeWSSG12 fatcat:wh7prtjjfredpajnlv7sbm3g7m

A practical scheduling scheme for non-uniform parallel loops on distributed memory parallel machines

Tong-Yee Lee, C.S. Raghavendra, H. Sivaraman
1996 Proceedings of HICSS-29: 29th Hawaii International Conference on System Sciences  
Our experimental results show that GDC performs well on many applications with different characteristics.  ...  In this paper, we present a global distributed control scheme (GDC) to schedule nonuniform loops on distributed memory parallel machines.  ...  Acknowledgment This research was performed in part using the Intel Touchstone Delta System operated by California Institute of Technology on behalf of the Concurrent Supercomputing Consortium.  ... 
doi:10.1109/hicss.1996.495468 dblp:conf/hicss/LeeRS96 fatcat:muk5izqdgndd7eq47cgpm2rkui

Executing association rule mining algorithms under a Grid computing environment

Raja Tlili, Yahya Slimani
2011 Proceedings of the Workshop on Parallel and Distributed Systems Testing, Analysis, and Debugging - PADTAD '11  
This load imbalance is due to the dynamic nature of data mining algorithms (i.e. we cannot predict the load before execution) and the heterogeneity of Grid computing systems.  ...  A Grid infrastructure distributed in nine sites around France, for research in large-scale parallel and distributed systems.  ...  The decision phase is triggered when the load imbalance is detected to calculate optimal data redistribution.  ... 
doi:10.1145/2002962.2002973 dblp:conf/issta/TliliS11 fatcat:igb2rbh2yjgizhsaigjhms2hqu

Layout-aware I/O Scheduling for terabits data movement

Youngjae Kim, Scott Atchley, Geoffroy R. Vallee, Galen M. Shipman
2013 2013 IEEE International Conference on Big Data  
These competing uses often induce temporary, but significant, I/O load imbalances on the storage system, which impact the performance of all the users.  ...  Second, we present I/O optimization solutions with layout-awareness on end-system hosts for bulk data movement.  ...  Parallel file systems have been a widely adapted solution for scientific applications to support both high performance I/O and large data sets.  ... 
doi:10.1109/bigdata.2013.6691661 dblp:conf/bigdataconf/KimAVS13 fatcat:6h4jp5lloraplpg5wvuncpfhm4

A review of measurement and analysis of electric power quality on shipboard power system networks

Julio Barros, Ramón I. Diego
2016 Renewable & Sustainable Energy Reviews  
Voltage and frequency fluctuations, voltage dips and swells, transients and voltage notching, fault detection and classification, harmonic distortion and voltage imbalance are reviewed and discussed.  ...  Electric power quality is an important aspect of increasing concern in power system networks in ships.  ...  An important aspect of shipboard power systems is the massive use of non-linear loads in comparison to power system network on land.  ... 
doi:10.1016/j.rser.2016.05.043 fatcat:4hkgibuvgrfxtejkm7oukim7ua

High-Performance Massive Subgraph Counting using Pipelined Adaptive-Group Communication [article]

Langshi Chen, Bo Peng, Sabra Ossen, Anil Vullikanti, Madhav Marathe, Lei Jiang, Judy Qiu
2018 arXiv   pre-print
Recent applications have motivated solving such problems on massive networks with billions of vertices. In this chapter, we study the subgraph counting problem from a parallel computing perspective.  ...  We then present several system-level strategies to substantially improve the overall performance of the algorithm in massive subgraph counting problems.  ...  CIF-DIBBS 143054: Middleware and High Performance Analytics Libraries for Scalable Data Science, NSF EAGER grant, NSF Bigdata grant and DTRA CNIMS grant.  ... 
arXiv:1804.09764v1 fatcat:2vqvcohjf5fsjdoj3ch5xlhjzm

Supporting load balancing for distributed data-intensive applications

Leonid Glimcher, Vignesh T. Ravi, Gagan Agrawal
2009 2009 International Conference on High Performance Computing (HiPC)  
We have developed a load balancing algorithm, which minimizes the total time spent on processing the data.  ...  We have extensively evaluated our techniques using two data-intensive applications.  ...  and data transfer factors. 3) Evaluating the performance of load balancing system on a real WAN setting. 4) Evaluating the scalability of k-means and vortex detection applications with dynamic load balancing  ... 
doi:10.1109/hipc.2009.5433204 dblp:conf/hipc/GlimcherRA09 fatcat:2g5p5oi4lvaftdetdelwvr2ptm

Dynamic Load Balancing in GPU-Based Systems - Early Experiments [article]

Alvaro Luiz Fazenda, Celso L. Mendes, Laxmikant V. Kale, Jairo Panetta, Eduardo Rocha Rodrigues
2013 arXiv   pre-print
However, the use of GPUs to improve computational performance is quickly getting massively disseminated in the high-performance computing community.  ...  This paper aims to investigate how the same Charm++/AMPI framework can be extended to balance load in a synthetic application inspired by the BRAMS numerical forecast model, running mostly on GPUs rather  ...  INTRODUCTION According to Jack Dongarra [1] , load imbalance is one of the major problems to be handled to improve the parallel performance of applications.  ... 
arXiv:1310.4218v1 fatcat:gguynfmsnvf4xopfzru7xtyuey

Task Packing: Efficient task scheduling in unbalanced parallel programs to maximize CPU utilization

Gladys Utrera, Montse Farreras, Jordi Fornes
2019 Journal of Parallel and Distributed Computing  
Load imbalance in parallel systems can be generated by external factors to the currently running applications like operating system noise or the underlying hardware like a heterogeneous cluster.  ...  HPC applications working on irregular data structures can also have difficulties to balance their computations across the parallel tasks.  ...  Not only load imbalance results in the application losing performance but also prevents an efficient use of the High Performance Computing (HPC) system as a whole, wasting CPU cycles and ultimately wasting  ... 
doi:10.1016/j.jpdc.2019.08.003 fatcat:7ewbqhpv45ggfes2big4klmwwa

A Big Data Analysis on Distributed File Storage System

end users.  ...  To find a better development, researchers concentrated on Big Data Analysis (BDA), but the traditional databases, data techniques and platforms suffers from storage, imbalance data, scalability, insufficient  ...  There are four types of models presents in BDA such as in-memory models, MapReduce (MR)-based systems, Massively Parallel Processing (MPP) systems and Bulk Synchronous Parallel (BSP) systems.  ... 
doi:10.35940/ijitee.b6427.129219 fatcat:ar5zzzrzyzbafnbyk4sprgqcly

Converting massive TLP to DLP

Tirath Ramdas, Gregory K. Egan, David Abramson, Kim Baldridge
2007 Proceedings of the 4th international conference on Computing frontiers - CF '07  
This lack of uniformity limits the level of data-level parallelism (DLP) inherent in the application, thus apparently rendering a SIMD architecture unfeasible.  ...  All ERIs may be computed in parallel, therefore there is much thread-level parallelism (TLP).  ...  By converting TLP to DLP we effectively eliminate fine-grained imbalances which could have a highly beneficial impact on overall parallel system performance, though load-balancing implications are outside  ... 
doi:10.1145/1242531.1242570 dblp:conf/cf/RamdasEAB07 fatcat:cce6a3l22rdk7fis3vkncyeheu

Massively parallel genomic sequence search on the Blue Gene/P architecture

Heshan Lin, Pavan Balaji, Ruth Poole, Carlos Sosa, Xiaosong Ma, Wu-chun Feng
2008 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis  
Consequently, we propose and study different approaches for mapping sequence-search and parallel I/O tasks on such massively parallel architectures.  ...  This paper presents our first experiences in mapping and optimizing genomic sequence search onto the massively parallel IBM Blue Gene/P (BG/P) platform.  ...  On an individual basis, we are grateful to Carl Obert and John Thomas for their support and to Jeremy Archuleta whose feedback helped improve the presentation of the paper.  ... 
doi:10.1109/sc.2008.5222005 dblp:conf/sc/LinBPSMF08 fatcat:pwjowi5w6ba5vnnbupejeejm4a

Using Load Balancing to Scalably Parallelize Sampling-Based Motion Planning Algorithms

Adam Fidel, Sam Ade Jacobs, Shishir Sharma, Nancy M. Amato, Lawrence Rauchwerger
2014 2014 IEEE 28th International Parallel and Distributed Processing Symposium  
However, such methods are prone to load imbalance, as planning time depends on region characteristics and, for most problems, the heterogeneity of the subproblems increases as the number of processors  ...  in a more scalable and load-balanced computation on more than 3,000 cores. † Parasol Lab,  ...  [2] provide a parallel GPU-based RRT and RRT * by focusing on parallelizing the collision detection phase. A more recent work focused on multicore architectures [11] .  ... 
doi:10.1109/ipdps.2014.66 dblp:conf/ipps/FidelJSAR14 fatcat:h34hmau4wragbjotz3zi3lbdxe

Identifying the Root Causes of Wait States in Large-Scale Parallel Applications

David Böhme, Markus Geimer, Felix Wolf, Lukas Arnold
2010 2010 39th International Conference on Parallel Processing  
However, load or communication imbalance prevents many codes from taking advantage of the available parallelism, as delays of single processes may spread wait states across the entire machine.  ...  Driven by growing application requirements and accelerated by current trends in microprocessor design, the number of processor cores on modern supercomputers is increasing from generation to generation  ...  on massively parallel systems.  ... 
doi:10.1109/icpp.2010.18 dblp:conf/icpp/BohmeGWA10 fatcat:xpdljbzbtbht7oxnn5tifrsn2i

Identifying the Root Causes of Wait States in Large-Scale Parallel Applications

David Böhme, Markus Geimer, Lukas Arnold, Felix Voigtlaender, Felix Wolf
2016 ACM Transactions on Parallel Computing  
However, load or communication imbalance prevents many codes from taking advantage of the available parallelism, as delays of single processes may spread wait states across the entire machine.  ...  Driven by growing application requirements and accelerated by current trends in microprocessor design, the number of processor cores on modern supercomputers is increasing from generation to generation  ...  on massively parallel systems.  ... 
doi:10.1145/2934661 fatcat:gbmkjpkg5ncn7pvlrpq6d633ai