High Performance Data Transfer in Grid Environment Using GridFTP over InfiniBand

Hari Subramoni, Ping Lai, Raj Kettimuthu, Dhabaleswar K. Panda
2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
GridFTP has established itself as a popular tool for data transfer in the grid environment. It is built on top of the Globus XIO framework, which makes it easy to integrate new transport protocols and storage systems. The performance of GridFTP is highly dependent on the disk I/O techniques as well as on the underlying network communication protocols. Though GridFTP has many optimizations for disk I/O operations, the relatively low communication bandwidth offered by existing network protocols is a bottleneck for next-generation exascale applications that need to transfer exabytes of data across long distances. On the other hand, modern interconnects such as InfiniBand have introduced many advanced communication features, like zero-copy protocols and RDMA operations, which can greatly improve communication efficiency. Moreover, recently introduced InfiniBand WAN routers such as Obsidian Longbows now make it possible to extend the reach of InfiniBand to WAN distances. In this paper, we take on the challenge of combining the ease of use of the Globus XIO framework with the high performance achieved through InfiniBand communication, thereby natively supporting GridFTP over InfiniBand-based networks. The Advanced Data Transfer Service (ADTS) designed in our previous work provides low-level InfiniBand support to the Globus XIO layer so that it can be used by GridFTP as well as by end-user Grid applications built on top of Globus XIO. To achieve efficient disk-based data transfers, we introduce the concept of I/O staging in the Globus XIO ADTS driver. We evaluate our design in both LAN and WAN environments using microbenchmarks as well as communication traces from several real-world applications. We also provide insights into the communication performance through in-depth analysis. The results of our experimental evaluation show a performance improvement of up to 100% for ADTS-based data transfers compared with TCP- or UDP-based ones in LAN as well as high-delay WAN scenarios.
doi:10.1109/ccgrid.2010.115 dblp:conf/ccgrid/SubramoniLKP10 fatcat:5ebdlsdzvzdtvaeatdnlza3iti