PVFS over InfiniBand: design and performance evaluation

J. Wu, P. Wyckoff, Dhabaleswar Panda
2003 2003 International Conference on Parallel Processing, 2003. Proceedings.  
I/O is quickly emerging as the main bottleneck limiting performance in modern-day clusters. The need for scalable parallel I/O and file systems is becoming more and more urgent.  ...  We use Parallel Virtual File System (PVFS) as a basis for exploring these features.  ...  Kini from our research group for many discussions with us.  ... 
doi:10.1109/icpp.2003.1240573 dblp:conf/icpp/WuWP03 fatcat:cdy2wvjzargfjeiuzln32vv37q

High performance support of parallel virtual file system (PVFS2) over Quadrics

Weikuan Yu, Shuang Liang, Dhabaleswar K. Panda
2005 Proceedings of the 19th annual international conference on Supercomputing - ICS '05  
In this paper, we explore the challenges of supporting parallel file system with modern features of Quadrics, including user-level communication and RDMA operations.  ...  Quadrics QDMA and RDMA mechanisms are integrated and optimized for high performance data communication.  ...  Jiesheng Wu from Ask Jeeves, Inc for many technical discussions. We would like to thank members from the PVFS2 team for their technical help.  ... 
doi:10.1145/1088149.1088192 dblp:conf/ics/YuLP05 fatcat:mpv3r6hmjzd4ben2zioklh2zs4

Fast and Concurrent RDF Queries with RDMA-Based Distributed Graph Exploration

Jiaxin Shi, Youyang Yao, Rong Chen, Haibo Chen, Feifei Li
2016 USENIX Symposium on Operating Systems Design and Implementation  
Evaluation on a 6-node RDMA-capable cluster shows that Wukong significantly outperforms state-of-the-art systems like TriAD and Trinity.RDF for both latency and throughput, usually at the scale of orders  ...  the low latency and high throughput of one-sided RDMA operations, and proposes a worker-obliger model for efficient load balancing.  ...  Wukong significantly outperforms state-of-the-art systems and can process a mixture of small and large queries at 269K queries/second on a 6-node RDMA-capable cluster.  ... 
dblp:conf/osdi/ShiYCCL16 fatcat:iszwak734zag5klwqpviy3vyp4

An RDMA Middleware for Asynchronous Multi-stage Shuffling in Analytical Processing [chapter]

Rui C. Gonçalves, José Pereira, Ricardo Jiménez-Peris
2016 Lecture Notes in Computer Science  
In this paper we describe the design and implementation of a communication middleware to support data shuffling for executing multi-stage analytical processing operations in parallel.  ...  Experimental results show that the RDMA-based middleware developed can provide a 75% reduction of the costs of communication operations on parallel analytical processing tasks, when compared with a sockets  ...  Competitiveness and Internationalisation - COMPETE 2020 Programme and by National Funds through the FCT - Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) within  ... 
doi:10.1007/978-3-319-39577-7_5 fatcat:pa7evaovdfdlpmrbpjdg67rhfa

Design and testbed evaluation of RDMA-based middleware for high-performance data transfer applications

Yufei Ren, Tan Li, Dantong Yu, Shudong Jin, Thomas Robertazzi
2013 Journal of Systems and Software  
We design a middleware layer of high-speed communication based on Remote Direct Memory Access (RDMA) that serves as the common substrate to accelerate various data transfer tools, such as FTP, HTTP, file  ...  We provide a reference implementation of the popular file-transfer protocol over this RDMA-based middleware layer, called RFTP.  ...  This block is marked as "waiting" state. At this point, the data sink is waiting for the remote side to fill this block using RDMA one-sided operation.  ... 
doi:10.1016/j.jss.2013.01.070 fatcat:suqi7ayvtjeidox4e6u7kfztuy

High performance virtual machine migration with RDMA over modern interconnects

Wei Huang, Qi Gao, Jiuxing Liu, Dhabaleswar K. Panda
2007 2007 IEEE International Conference on Cluster Computing  
As a basis for many administration tools in modern clusters and data-centers, VM migration is desired to be extremely efficient to reduce both migration time and performance impact on hosted applications  ...  The evaluations using our prototype implementation over Xen and InfiniBand show that RDMA can drastically reduce the migration overhead: up to 80% on total migration time and up to 77% on application observed  ...  OS-bypass allows data communication to be directly initiated from process user space; on top of that, RDMA allows direct data movement from the memory of one computer into that of another.  ... 
doi:10.1109/clustr.2007.4629212 dblp:conf/cluster/HuangGLP07 fatcat:ogis52myzvcydjig7ht47okmq4

Designing Efficient Systems Services and Primitives for Next-Generation Data-Centers

K. Vaidyanathan, S. Narravula, P. Balaji, D.K. Panda
2007 2007 IEEE International Parallel and Distributed Processing Symposium  
Current data-centers lack in efficient support for intelligent services, such as requirements for caching documents and cooperation of caching servers, efficiently monitoring and managing the limited physical  ...  ., RDMA, atomic operations).  ...  Current data-centers rely on TCP/IP for communication even within the cluster.  ... 
doi:10.1109/ipdps.2007.370507 dblp:conf/ipps/VaidyanathanNBP07 fatcat:xgvjpc3qjzh2fep5d6fukslupm

Scalable Work Stealing of Native Threads on an x86-64 Infiniband Cluster

Shigeki Akiyama, Kenjiro Taura
2016 Journal of Information Processing  
One-sided work stealing is a popular approach to achieving high efficiency of load balancing; therefore this also limits scalability of distributed memory task parallelism.  ...  We develop one-sided and non one-sided implementations of inter-node work stealing, and evaluate the performance and efficiency of the work stealing implementations.  ...  The calculations were carried out on the TSUBAME2.5 supercomputer in the Tokyo Institute of Technology.  ... 
doi:10.2197/ipsjjip.24.583 fatcat:lrsku2xz4feadb74xmlwzjqvrq

Supporting efficient noncontiguous access in PVFS over Infiniband

Jiesheng Wu, Wyckoff, Panda
2003 Proceedings IEEE International Conference on Cluster Computing CLUSTR-03  
We propose a novel approach, RDMA Gather/Scatter, to transfer noncontiguous data for such I/O accesses.  ...  This characteristic imposes a requirement of native noncontiguous I/O access support in cluster file systems for high performance.  ...  We are also thankful to Jiuxing Liu and Pavan Balaji for discussion with us.  ... 
doi:10.1109/clustr.2003.1253333 dblp:conf/cluster/WuWP03 fatcat:zesr2ucmyban7e3fmo6jegahqy

Balancing CPU and Network in the Cell Distributed B-Tree Store

Christopher Mitchell, Kate Montgomery, Lamont Nelson, Siddhartha Sen, Jinyang Li
2016 USENIX Annual Technical Conference  
Our evaluation on a large RDMA-capable cluster shows that Cell scales well and that its dynamic selector effectively responds to resource availability and workload properties.  ...  Within each fat node, Cell organizes keys as a local B-tree of RDMA-friendly small nodes for client-side searches.  ...  Acknowledgments We would like to thank our shepherd, Peter Pietzuch, for his guidance and helpful suggestions.  ... 
dblp:conf/usenix/MitchellMNSL16 fatcat:elgmqaerebfczer324wltjjng4

Design and Implementation of MPICH2 over InfiniBand with RDMA Support [article]

Jiuxing Liu, Weihang Jiang, Pete Wyckoff, Dhabaleswar K. Panda, David Ashton, Darius Buntinas, William Gropp, Brian Toonen
2003 arXiv   pre-print
For several years, MPI has been the de facto standard for writing parallel applications. One of the most popular MPI implementations is MPICH.  ...  We have based our design on the RDMA Channel interface provided by MPICH2, which encapsulates architecture-dependent communication functionalities into a very small set of functions.  ...  Other protocols on InfiniBand need efficient processing, including one-sided communication in MPI-2, DSM systems, and parallel file systems.  ... 
arXiv:cs/0310059v1 fatcat:tkaemmkasjbcteqgxl7veggily

Server I/O networks past, present, and future

Renato John Recio
2003 Proceedings of the ACM SIGCOMM workshop on Network-I/O convergence experience, lessons, implications - NICELI '03  
Recently several technologies have emerged that enable a single interconnect to be used as more than one fabric type.  ...  enabling network convergence; and how these new technologies are being deployed on various network families.  ...  This section will explore these two inhibitors in more detail. 7.1.1 Block Mode access over Ethernet Network interfaces that offload IP processing will be required for block-mode storage access over  ... 
doi:10.1145/944748.944749 fatcat:mpsgonhiw5d6tiqvr7pzbytuvi

The AXIOM software layers

Carlos Álvarez, Eduard Ayguadé, Jaume Bosch, Javier Bueno, Artem Cherkashin, Antonio Filgueras, Daniel Jiménez-González, Xavier Martorell, Nacho Navarro, Miquel Vidal, Dimitris Theodoropoulos, Dionisios N. Pnevmatikatos (+13 others)
2016 Microprocessors and microsystems  
This poses pressure onto systems design to support increasing demands on computational power, while keeping a low power envelope.  ...  AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware.  ...  He spent one year working with the BG/L team in the IBM Watson Research Center. He has coauthored more than 60 publications in international journals and conferences.  ... 
doi:10.1016/j.micpro.2016.07.002 fatcat:asg7sun5rvesjmkvnp5h5xad6a

Server I/O networks past, present, and future

Renato John Recio
2003 Proceedings of the ACM SIGCOMM workshop on Network-I/O convergence experience, lessons, implications - NICELI '03  
Recently several technologies have emerged that enable a single interconnect to be used as more than one fabric type.  ...  enabling network convergence; and how these new technologies are being deployed on various network families.  ...  This section will explore these two inhibitors in more detail. 7.1.1 Block Mode access over Ethernet Network interfaces that offload IP processing will be required for block-mode storage access over  ... 
doi:10.1145/944747.944749 fatcat:4ytmiuwahba6jgf4y2d7kq2pey

Microbenchmark performance comparison of high-speed cluster interconnects

Jiuxing Liu, B. Chandrasekaran, Weikuan Yu, Jiesheng Wu, D. Buntinas, S. Kini, D.K. Panda, P. Wyckoff
2004 IEEE Micro  
Host communication overhead. Cost of checking for communication completion: Because all three interconnects support RDMA, one way to detect the arrival of messages at the receiver side is to poll the destination  ...  For example, these tests use a single buffer at both the sender and receiver sides. In contrast, real applications might use multiple communication buffers on each side.  ... 
doi:10.1109/mm.2004.1268994 fatcat:2tmrx5boybgtxkvbhlmvweey3y