Evaluating Lustre's Metadata Server on a Multi-Socket Platform

Konstantinos Chasapis, Manuel F. Dolz, Michael Kuhn, Thomas Ludwig
2014 2014 9th Parallel Data Storage Workshop  
We run our experiments on a four socket NUMA platform that has 48 cores.  ...  The results demonstrate that Lustre's metadata performance is limited to a single socket and decreases when more sockets are used.  ...  Multi-core and multi-socket platforms can be used to serve many metadata requests in parallel.  ... 
doi:10.1109/pdsw.2014.5 dblp:conf/sc/ChasapisDKL14 fatcat:bzpacbwyjfdkznlotrpc2wt6q4

Autotuned parallel I/O for highly scalable biosequence analysis

Haihang You, Bhanu Rekapalli, Qing Liu, Shirley Moore
2011 Proceedings of the 2011 TeraGrid Conference on Extreme Digital Discovery - TG '11  
Experimental results show linear speedup with increasing numbers of computing cores on a supercomputer, allowing the domain identification of millions of proteins in few minutes using hundreds of thousands  ...  Improving the performance and scalability of bioinformatics tools thus becomes a critical step in the quest to transform ever-growing raw genomics data into biological knowledge.  ...  Because during a full machine run each MPI process on a compute node still generates I/O traffic which causes overwhelming contention not only on Lustre's metadata server, but also on storage servers.  ... 
doi:10.1145/2016741.2016772 fatcat:njq3gkwi2rcoln3iqku5j4imdq

Software-defined QoS for I/O in exascale computing

Yusheng Hua, Xuanhua Shi, Hai Jin, Wei Liu, Yan Jiang, Yong Chen, Ligang He
2019 CCF Transactions on High Performance Computing  
Evaluation shows that SDQoS can effectively control the I/O bandwidth within a 5%-10% deviation and improve the performance by 20% in extreme cases.  ...  In this paper, we propose SDQoS, a software-defined QoS framework with the token bucket algorithm, aiming to meet the I/O requirements of concurrent applications contending for the I/O resources and improve  ...  This is because less GPUs per CPU means that one GPU can achieve higher bandwidth from a single socket.  ... 
doi:10.1007/s42514-019-00005-9 fatcat:c5xjiw3vvvdjrpef3xlv7x6xii

NORNS: Extending Slurm to Support Data-Driven Workflows through Asynchronous Data Staging

Alberto Miranda, Adrian Jackson, Tommaso Tocci, Iakovos Panourgias, Ramon Nou
2019 2019 IEEE International Conference on Cluster Computing (CLUSTER)  
Our evaluation shows that a workflow-aware Slurm exploits node-local storage more effectively, reducing the filesystem I/O contention and improving job running times.  ...  It also introduces a new service for asynchronous data staging called NORNS that coordinates with the job scheduler to orchestrate data transfers to achieve better resource utilization.  ...  data items are mapped to if there are multiple node-local resources (i.e. where there is a mount point per socket on a dual socket node).  ... 
doi:10.1109/cluster.2019.8891014 dblp:conf/cluster/MirandaJTPN19 fatcat:hc7nopkglze3pe3njvkwkzhusi

Accelerating Network Communication and I/O in Scientific High Performance Computing Environments

Sarah Marie Neuwirth
The frameworks are evaluated on the Titan supercomputing systems for three I/O interfaces.  ...  Originally driven through the increase of operating frequencies and technology scaling, a recent slowdown in this evolution has led to the development of multi-core architectures, which are supported by  ...  the target and evaluation platform.  ... 
doi:10.11588/heidok.00025757 fatcat:6slijzy6k5hr5iiw443gbuvwe4

Performance evaluation and comparison of NVMe memory and Lustre File System on CLAIX

Lukas Aldenhoff, Matthias Stefan Müller, Uwe Naumann
Furthermore there are 8 Object Storage Servers and one Metadata Server and one Metadata Target deployed.  ...  Lustre is an open-source platform focusing on performance and scalability with a client-server model, hence it is used on many big compute clusters.  ...  For example, previous studies have elaborated on software-side optimization to improve metadata performance [?] [?].  ... 
doi:10.18154/rwth-2018-223515 fatcat:4ovmjpniwnbntgjusxhpidnjua

ClusterRAID: Architecture and Prototype of a Distributed Fault-Tolerant Mass Storage System for Clusters

Arne Wiebalck
The inherent unreliability of the underlying components is one of the reasons why no system has been established as a standard storage solution for clusters yet.  ...  The key concept of the presented system is the conversion of the local hard disk drive of a cluster node into a reliable device while preserving the block device interface.  ...  In the future Lustre's performance is planned to be enhanced by the implementation of sophisticated caching strategies and the use of distributed metadata servers in order to avoid bottlenecks for larger  ... 
doi:10.11588/heidok.00005624 fatcat:m7pr2byqczdltbhqfof3ho7mcy