Fast Parallel I/O on Cluster Computers [article]

Thomas Duessel, Norbert Eicker, Florin Isaila, Thomas Lippert, Thomas Moschny, Hartmut Neff, Klaus Schilling, Walter Tichy
2003 arXiv   pre-print
Today's cluster computers suffer from slow I/O, which slows down I/O-intensive applications.  ...  We show that fast disk I/O can be achieved by operating a parallel file system over fast networks such as Myrinet or Gigabit Ethernet.  ...  Acknowledgments This work was supported by the Deutsche Forschungsgemeinschaft as twinning project "Alpha-Linux-Cluster" (Ti264/6-1 & Li701/3-1).  ... 
arXiv:cs/0303016v1 fatcat:odppcpjyc5bv5axmnqe2mocheu

Fast electrostatic force calculation on parallel computer clusters

Amirali Kia, Daejoong Kim, Eric Darve
2008 Journal of Computational Physics  
Acknowledgment This work was supported in part by NSF award CNS-0619926 for computer resources.  ...  Assume we want to calculate a sum of the form: $U(r) = \sum_{i=1}^{N} \frac{q_i}{|r - r_i|}$. The far-field is formally approximated by an expansion of the type: $U(r) \approx \sum_{k=1}^{p} A_k M_k(r)$ (1), where $A_k = \sum_{i=1}^{N} q_i O$  ...  The scalability was found to be excellent including on clusters with slow networks (or fast processors).  ... 
doi:10.1016/ fatcat:vcdf5tdgzzdq3h652wcy7atgqq

Single I/O space for scalable cluster computing

R.S.C. Ho, Kai Hwang, Hai Jin
1999 ICWC 99. IEEE Computer Society International Workshop on Cluster Computing  
While traditional approaches focused on the user level or the distributed file subsystem level, we separate the I/O subsystem of a cluster into the file system and a set of distributed Virtual Device Drivers  ...  Compared to previous approaches, our approach has higher transparency, better performance, lower implementation cost, higher availability, and application compatibility for I/O intensive cluster computing  ...  Specifically, cluster computing demands a single I/O space in distributed, I/O intensive operations.  ... 
doi:10.1109/iwcc.1999.810821 dblp:conf/iwcc/HoJH99 fatcat:rxgpengv7faejd52myigpjqyye

Exploiting data compression in collective I/O techniques

Rosa Filgueira, David E. Singh, Juan C. Pichel, Jesus Carretero
2008 2008 IEEE International Conference on Cluster Computing  
This paper presents Two-Phase Compressed I/O (TPC I/O), an optimization of the Two-Phase collective I/O technique from ROMIO, the most popular MPI-IO implementation.  ...  Compared with Two-Phase I/O, Two-Phase Compressed I/O obtains important improvements in the overall execution time for many of the considered scenarios.  ...  Many parallel applications (especially related to simulations) consist of alternating compute and I/O phases. During the compute phase, the simulated process evolves to new states.  ... 
doi:10.1109/clustr.2008.4663811 dblp:conf/cluster/FilgueiraSPC08 fatcat:xgj6h6hrn5amlg43kgtozxmrji

Dynamic Model-Driven Parallel I/O Performance Tuning

Babak Behzad, Surendra Byna, Stefan M. Wild, Prabhat, Marc Snir
2015 2015 IEEE International Conference on Cluster Computing  
Parallel I/O performance depends highly on the interactions among multiple layers of the parallel I/O stack.  ...  Using this approach, we demonstrate 6X-94X speedup over default I/O time for different I/O kernels running on multiple HPC systems.  ...  The advantages of our proposed method include fast reduction of the search space compared to a GA approach and consideration of dynamic conditions of a parallel I/O subsystem.  ... 
doi:10.1109/cluster.2015.37 dblp:conf/cluster/BehzadBWPS15 fatcat:2e5sg4jbxvhu5ceclfxdinddxm

Paralleled Fast Search and Find of Density Peaks Clustering Algorithm on GPUs with CUDA

Mi Li, Jie Huang, Jingpeng Wang
2016 International Journal of Networked and Distributed Computing (IJNDC)  
Fast Search and Find of Density Peaks (FSFDP) is a newly proposed clustering algorithm that has already been successfully applied in many applications.  ...  Moreover, we evaluate our GPU-based implementation on GPU clusters of 9 nodes and compared to one GPU node, the program can achieve a further 7.55X speedup.  ...  Firstly, we evaluate the difference of the time cost on I/O and GPU kernel.  ... 
doi:10.2991/ijndc.2016.4.3.4 fatcat:pglkwezo6bb65boxkinbh4rlji

Noncontiguous I/O accesses through MPI-IO

A. Ching, A. Choudhary, K. Coloma, Wei-keng Liao, R. Ross, W. Gropp
2003 CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings.  
I/O performance remains a weakness of parallel computing systems today.  ...  A method of noncontiguous data access, list I/O, was recently implemented in the Parallel Virtual File System (PVFS). We implement support for this interface in the ROMIO MPI-IO implementation.  ...  subprogram of the Office of Advanced Scientific Computing Research, U.S.  ... 
doi:10.1109/ccgrid.2003.1199358 dblp:conf/ccgrid/ChingCCLRG03 fatcat:kz46kqmyprduli7rz4ruxizefe

Improving I/O Forwarding Throughput with Data Compression

Benjamin Welton, Dries Kimpe, Jason Cope, Christina M. Patrick, Kamil Iskra, Robert Ross
2011 2011 IEEE International Conference on Cluster Computing  
We studied the effect of the compression services on a variety of data sets and conducted experiments on a high-performance computing cluster.  ...  In this paper, we investigate improvements to I/O performance by exploiting this gap.  ...  We gratefully acknowledge the computing resources provided on Fusion, a 320-node computing cluster operated by the Laboratory Computing Resource Center at Argonne National Laboratory.  ... 
doi:10.1109/cluster.2011.80 dblp:conf/cluster/WeltonKCPIR11 fatcat:57ysrt6n7be3lgfrzvbudy5eh4

TRIO: Burst Buffer Based I/O Orchestration

Teng Wang, Sarp Oral, Michael Pritchard, Bin Wang, Weikuan Yu
2015 2015 IEEE International Conference on Cluster Computing  
However, directly writing the large and bursty checkpointing dataset to parallel file systems can incur significant I/O contention on storage servers.  ...  Recently burst buffers have been proposed as an intermediate layer to absorb the bursty I/O traffic from compute nodes to storage backend.  ...  This can be accomplished by partitioning storage servers into disjoint sets and assigning one arbitrator to orchestrate the I/O requests to each set.  ... 
doi:10.1109/cluster.2015.38 dblp:conf/cluster/WangOPWY15 fatcat:ordq4ic3rbdajbottv3yexkeke

I/O-Aware Batch Scheduling for Petascale Computing Systems

Zhou Zhou, Xu Yang, Dongfang Zhao, Paul Rich, Wei Tang, Jia Wang, Zhiling Lan
2015 2015 IEEE International Conference on Cluster Computing  
In this paper, we present a novel I/O-aware batch scheduling framework to coordinate ongoing I/O requests on petascale computing systems.  ...  Conventional approaches either focus on optimizing an application's access pattern individually or handle I/O requests on low-level storage layer without any knowledge from the upper-level applications  ...  k is the elapsed time from the start time of current I/O operation to now and time spent on the j-th computation/communication operation and $T^{I/O}_{i,j}$ the time of the j-th I/O operation without I/O congestion  ... 
doi:10.1109/cluster.2015.45 dblp:conf/cluster/ZhouYZRTWL15 fatcat:p7orvdwhlvc4ti4grlizgalg6m

Improving Parallel I/O Performance with Data Layout Awareness

Yong Chen, Xian-He Sun, Rajeev Thakur, Huaiming Song, Hui Jin
2010 2010 IEEE International Conference on Cluster Computing  
The experimental results verify that the proposed strategy could improve parallel I/O performance by nearly 40% on average.  ...  The poor I/O performance has been attributed as a critical cause of the low sustained performance of parallel computing systems.  ...  We performed a series of tests on the Sun Fire cluster to compare the performance of layout-aware collective I/O and the original one.  ... 
doi:10.1109/cluster.2010.35 dblp:conf/cluster/ChenSTSJ10 fatcat:rf4g26ga6neaxnclbohms5s3qy

SEMPLAR: high-performance remote parallel I/O over SRB

N. Ali, M. Lauria
2005 CCGrid 2005. IEEE International Symposium on Cluster Computing and the Grid, 2005.  
We have provided I/O performance results for a high-performance computing workload on three different clusters.  ...  In this paper we describe SEMPLAR, a library for remote, parallel I/O that combines the standard programming interface of MPI-IO with the remote storage functionality of the SDSC Storage Resource Broker  ...  We are especially indebted to Henri Bal of Vrije Universiteit, Amsterdam for giving us access to the DAS-2 cluster.  ... 
doi:10.1109/ccgrid.2005.1558578 dblp:conf/ccgrid/AliL05 fatcat:vok6x75bwzfibjxjnyoazuhexe

H5hut: A high-performance I/O library for particle-based simulations

Mark Howison, Andreas Adelmann, E. Wes Bethel, Achim Gsell, Benedikt Oswald, Prabhat
2010 2010 IEEE International Conference On Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS)  
I/O and data management expertise.  ...  Achieving high-performance I/O for this data, effectively managing it on disk, and interfacing it with analysis and visualization tools can be challenging, especially for domain scientists who do not have  ...  Both the Parallel Log-structured File System (PLFS) [9] and the Adaptable I/O System (ADIOS) [10] address the performance issues of parallel I/O on large HPC systems.  ... 
doi:10.1109/clusterwksp.2010.5613098 fatcat:zol4d2ltxzbqldooe3s25ew2xm

View-Based Collective I/O for MPI-IO

Javier García Blas, Florin Isaila, David E. Singh, J. Carretero
2008 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID)  
The evaluation section shows that view-based I/O outperforms the original two-phase collective I/O from ROMIO in most of the cases for three well-known parallel I/O benchmarks.  ...  ACKNOWLEDGMENT We are grateful to the High Performance Computing Center Stuttgart (HLRS) for the offered support, especially to Rainer Keller and Alexander Schulz.  ... 
doi:10.1109/ccgrid.2008.85 dblp:conf/ccgrid/BlasISC08 fatcat:d4kajjhz7vby7e57zglm6ixlma

Load Balancing using Grid-based Peer-to-Peer Parallel I/O

Yijian Wang, David Kaeli
2005 Proceedings IEEE International Conference on Cluster Computing  
In order to overcome I/O bottlenecks and to increase I/O parallelism, data streams need to be parallelized at both the application level and the storage device level.  ...  Next, we describe a profile-guided data allocation algorithm that can increase the degree of I/O parallelism present in the system, as well as balance I/O in a heterogeneous system.  ...  In [6], Abawajy addresses the problem of effective management of parallel I/O in cluster computing systems by using two new I/O scheduling algorithms.  ... 
doi:10.1109/clustr.2005.347040 dblp:conf/cluster/WangK05 fatcat:zretqxaygffubalxp6n6llactq