
Improving Parallel I/O Performance Using Multithreaded Two-Phase I/O with Processor Affinity Management [chapter]

Yuichi Tsujita, Kazumi Yoshinaga, Atsushi Hori, Mikiko Sato, Mitaro Namiki, Yutaka Ishikawa
2014 Lecture Notes in Computer Science  
such as OpenMPI  Supports many parallel file systems such as Lustre or PVFS2 through an ADIO interface layer x Motivation (2)  Our proposal • Multithreaded Two-Phase I/O by using a Pthreads library  ...  Collective I/O • High throughput by using parallel I/O • Suitable for parallel file systems such as Lustre Data transfer MPI-IO (3) Software stack of ROMIO ROMIO Parallel I/O in upper layer  ... 
doi:10.1007/978-3-642-55224-3_67 fatcat:cjazxneiuned3ng2f5dw7cjp2m

TAPIOCA: An I/O Library for Optimized Topology-Aware Data Aggregation on Large-Scale Supercomputers

Francois Tessier, Venkatram Vishwanath, Emmanuel Jeannot
2017 2017 IEEE International Conference on Cluster Computing (CLUSTER)  
On both architectures, we show a substantial improvement of I/O performance compared with the default MPI I/O implementation.  ...  On BG/Q+GPFS, for instance, our algorithm leads to a performance improvement by a factor of twelve while on the Cray XC40 system associated with a Lustre filesystem, we achieve an improvement of four.  ...  This research is partially supported by the NCSA-Inria-ANL-BSC-JSC-Riken Joint-Laboratory on Extreme Scale Computing (JLESC).  ... 
doi:10.1109/cluster.2017.80 dblp:conf/cluster/TessierVJ17 fatcat:kssyr2khd5ckrbb5yvou3x7w3q

Using Object Based Files for High Performance Parallel I/O

Jeremy Logan, Phillip M. Dickens
2007 2007 4th IEEE Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications  
We contend that the scalable I/O problem in high performance computing is largely due to the legacy view of a file as a linear sequence of bytes.  ...  We analyze the performance of our system using the FLASH-IO benchmark, and demonstrate a substantial performance improvement over the standard ROMIO implementation.  ...  Much like two-phase I/O, a subset of the processors are used as aggregators to arrange object data in file order, allowing disk writes to be efficiently performed on large contiguous blocks. VI.  ... 
doi:10.1109/idaacs.2007.4488394 fatcat:b436j53h55hvtpzxy7aw24li2i

ISOBAR hybrid compression-I/O interleaving for large-scale parallel I/O optimization

Eric R. Schendel, Scott Klasky, Robert Ross, Nagiza F. Samatova, Saurabh V. Pendse, John Jenkins, David A. Boyuka, Zhenhuan Gong, Sriram Lakshminarasimhan, Qing Liu, Hemanth Kolla, Jackie Chen
2012 Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing - HPDC '12  
In this paper, we propose a hybrid framework for interleaving I/O with data compression to achieve improved I/O throughput side-by-side with reduced dataset size.  ...  Current peta-scale data analytics frameworks suffer from a significant performance bottleneck due to an imbalance between their enormous computational power and limited I/O bandwidth.  ...  Depending on the transport method chosen in the configuration file, ADIOS can store data from all the writing processes into a single shared file using collective MPI-IO or multiple files using POSIX I  ... 
doi:10.1145/2287076.2287086 dblp:conf/hpdc/SchendelPJBBGLLKCKRS12 fatcat:ejt432xn4jasfkrfwis4gbktfy

Scalable I/O forwarding framework for high-performance computing systems

Nawab Ali, Philip Carns, Kamil Iskra, Dries Kimpe, Samuel Lang, Robert Latham, Robert Ross, Lee Ward, P. Sadayappan
2009 2009 IEEE International Conference on Cluster Computing and Workshops  
The I/O nodes perform operations on behalf of the compute nodes and can reduce file system traffic by aggregating, rescheduling, and caching I/O requests.  ...  This paper presents an open, scalable I/O forwarding framework for high-performance computing systems.  ...  Applications using MPI-IO need to translate the MPI-IO calls to POSIX I/O, which eliminates the possible use of file system specific optimizations performed at the MPI-IO layer.  ... 
doi:10.1109/clustr.2009.5289188 dblp:conf/cluster/AliCIKLLRWS09 fatcat:imcs7pkqhza7dpfsckol446fsu

I/O threads to reduce checkpoint blocking for an electromagnetics solver on Blue Gene/P and Cray XK6

Jing Fu, Robert Latham, Misun Min, Christopher D. Carothers
2012 Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers - ROSS '12  
and a tuned MPI-IO collective approach (coIO).  ...  We discuss an I/O-thread based, application-level, two-phase I/O approach, called "threaded reduced-blocking I/O" (threaded rbIO), and compare it with a regular version of "reduced-blocking I/O" (rbIO)  ...  In particular, we compare the application-level, two-phase reduced blocking I/O (rbIO) that uses an I/O thread (POSIX thread), with the regular rbIO and a well-tuned MPI-IO collective I/O approach (coIO  ... 
doi:10.1145/2318916.2318919 fatcat:bdm47bhdrvaghfwyqpbtnox4jq

Topology-Aware Strategy for MPI-IO Operations in Clusters

Weifeng Liu, Jie Zhou, Meng Guo
2018 Journal of Optimization  
This paper presents the topology-aware two-phase I/O (TATP), which optimizes the most popular collective MPI-IO implementation of ROMIO.  ...  In most of the considered scenarios, topology-aware two-phase I/O obtains important improvements when compared with the original two-phase I/O implementations.  ...  We thank National Supercomputer Center in Jinan (NSCCJN) and Shandong Province High Performance Computing Center (SDHPCC) for providing experiment environment.  ... 
doi:10.1155/2018/2068490 fatcat:xupnrjs67jdo7nugvegrdlzv3y

Multiple-Level MPI File Write-Back and Prefetching for Blue Gene Systems [chapter]

Javier García Blas, Florin Isailă, J. Carretero, Robert Latham, Robert Ross
2009 Lecture Notes in Computer Science  
The experimental results demonstrate that both solutions achieve high performance through a high degree of overlap between computation, communication, and file I/O.  ...  We describe and evaluate a two-level file write-back implementation and a one-level prefetching solution.  ...  An implementation of MPI-IO for Cray architecture and the Lustre file system is described in [10].  ... 
doi:10.1007/978-3-642-03770-2_23 fatcat:zjpiuv4xbrcwrgjxwfiovqf4le

SCALER: Scalable parallel file write in HDFS

Xi Yang, Yanlong Yin, Hui Jin, Xian-He Sun
2014 2014 IEEE International Conference on Cluster Computing (CLUSTER)  
Two camps of file systems exist: parallel file systems designed for conventional high performance computing (HPC) and distributed file systems designed for newly emerged data-intensive applications.  ...  This study introduces a system solution, named SCALER, which allows MPI based applications to directly access HDFS without extra data movement.  ...  Furthermore, the SCALER design adopts a client-side collective I/O approach that is similar to the Collective I/O scheme in MPI-IO.  ... 
doi:10.1109/cluster.2014.6968736 dblp:conf/cluster/YangYJS14 fatcat:h3r7eawslzc45eo5alhdevm2ea

Capturing inter-application interference on clusters

Aamer Shah, Felix Wolf, Sergey Zhumatiy, Vladimir Voevodin
2013 2013 IEEE International Conference on Cluster Computing (CLUSTER)  
Unfortunately, traditional performance-analysis techniques consider an application always in isolation, without the ability to compare its performance to the overall performance conditions on the system  ...  Cluster systems usually run several applications, often from different users, concurrently, with individual applications competing for access to shared resources such as the file system or the network.  ...  All our file I/O was performed on a Lustre file system with four metadata servers (MDS) of type Bull NovaScale R423-E2 (two Nehalem-EP quad-core & two Westmere-EP, 6-core) and eight object storage (OST  ... 
doi:10.1109/cluster.2013.6702665 dblp:conf/cluster/ShahWZV13 fatcat:3bvs5pffm5hehg37wphekugnce

PaKman: Scalable Assembly of Large Genomes on Distributed Memory Machines [article]

Priyanka Ghosh, Sriram Krishnamoorthy, Ananth Kalyanaraman
2019 bioRxiv   pre-print
Our approach focuses on improving performance through a combination of novel data structures and algorithmic strategies for reducing the communication and I/O footprint during the assembly process.  ...  , and irregular access footprints of memory and I/O operations.  ...  While we primarily focus our comparative evaluation on the Lustre file system, we detail PaKman's I/O behavior on all three file systems.  ... 
doi:10.1101/523068 fatcat:4tfsylflybcttozouex4sraxlu

Efficient Software for Archiving and Retrieving Results of Massive Bioinformatics Analyses in High-Performance Computing Environments

Craig P Steffen, Roland Haas, Katherine Kendig, Liudmila Mainzer, Ryan Chui, Christina Fliege
2021 Zenodo  
The present manuscript reviews several recently developed parallel alternatives, showcasing their performance on a variety of high performance computing systems.  ...  Parallel file systems, such as Lustre, GPFS, and tape archives, can perform poorly under these circumstances due to overabundance of metadata.  ...  XSEDE allocation TG-ASC170008, "Testing and Performance Measuring of parfu and PTGZ File Archive tools for Bioinformatics and other Big Data Fields".  ... 
doi:10.5281/zenodo.5805629 fatcat:avnfbxcjzvew3p54dui3yxzz7i

ZOID: I/O-forwarding infrastructure for petascale architectures

Kamil Iskra, John W. Romein, Kazutomo Yoshii, Pete Beckman
2008 Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming - PPoPP '08  
Through the use of optimized network protocols and data paths, as well as a multithreaded daemon running on I/O nodes, ZOID provides greater performance than does the stock infrastructure.  ...  In this paper, we introduce a component of ZeptoOS called ZOID, an I/O-forwarding infrastructure for architectures such as IBM Blue Gene that decouple file and socket I/O from the compute nodes, shipping  ...  Figure 5. NFS file I/O performance: read (left) and write (right). Figure 4. Base collective network performance. Figure 6. PVFS file I/O performance: read (left) and write (right).  ... 
doi:10.1145/1345206.1345230 dblp:conf/ppopp/IskraRYB08 fatcat:w3adzkidxrbuhnc4dzk2wo2ooe

Evaluating the Benefits of Key-Value Databases for Scientific Applications [chapter]

Pol Santamaria, Lena Oden, Eloy Gil, Yolanda Becerra, Raül Sirvent, Philipp Glock, Jordi Torres
2019 Lecture Notes in Computer Science  
Given the computing needs, we study the effects of replacing a traditional storage system with a distributed Key-Value database on a cell segmentation application.  ...  The original code uses HDF5 files on GPFS through an intricate interface, imposing synchronizations.  ...  MPI-IO allows parallel non-contiguous writes in HDF5 but requires a POSIX-compliant file system.  ... 
doi:10.1007/978-3-030-22734-0_30 fatcat:vjlt4pqsona4njuxdob64fgg4a

BD-CATS: big data clustering at trillion particle scale

Md. Mostofa Ali Patwary, Pradeep Dubey, Suren Byna, Nadathur Rajagopalan Satish, Narayanan Sundaram, Zarija Lukić, Vadim Roytershteyn, Michael J. Anderson, Yushu Yao, Prabhat
2015 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '15  
Our system, called BD-CATS, is the first one capable of performing end-to-end analysis at trillion particle scale (including: loading the data, geometric partitioning, computing kd-trees, performing clustering  ...  Summarizing and analyzing raw particle data is challenging, and scientists often focus on density structures, whether in the real 3D space, or a high-dimensional phase space.  ...  Identical to reading, we use MPI-IO in collective I/O mode to use a small number of aggregators to interact with the file system and hence achieve similar behavior.  ... 
doi:10.1145/2807591.2807616 dblp:conf/sc/PatwaryBSSLRAYP15 fatcat:xyxhtzg22bex5nch6aa5fbbo7u