Proceedings of the 24th ACM International Conference on Supercomputing - ICS '10
As the number of I/O-intensive MPI programs grows, many efforts have been made to improve I/O performance on both the software and architecture sides. On the software side, researchers can optimize processes' access patterns, either individually (e.g., by using large and sequential requests in each process) or collectively (e.g., by using collective I/O). On the architecture side, files are striped over multiple I/O nodes for high aggregate I/O throughput. However, a key weakness, the access interference on each I/O node, remains unaddressed in these efforts. When requests from multiple processes are served simultaneously by multiple I/O nodes, each I/O node has to serve requests from different processes concurrently. An I/O node usually stores its data on hard disks, and different processes access different regions of a data set. When there is a burst of requests from multiple processes, requests from different processes to the same disk compete for its single disk head. Disk efficiency can be significantly reduced by the resulting frequent head seeks. In this paper, we propose a scheme, InterferenceRemoval, that eliminates I/O interference by taking advantage of optimized access patterns and the potentially high throughput provided by multiple I/O nodes. It identifies segments of files that could be involved in interfering accesses and replicates them to their respective designated I/O nodes. When interference is detected at an I/O node, some I/O requests can be redirected to the replicas on other I/O nodes, so that each I/O node serves requests from only one or a limited number of processes. InterferenceRemoval has been implemented in the MPI library, for high portability, on top of the Lustre parallel file system. Our experiments with representative benchmarks, such as NPB BTIO and mpi-tile-io, show that it can significantly improve the I/O performance of MPI programs. For example, the I/O throughput of mpi-tile-io can be increased by 105% compared with running without collective I/O, and by 23% compared with running with collective I/O.
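The redirection idea can be illustrated with a toy model. The sketch below is not the InterferenceRemoval implementation; it assumes round-robin striping, a hypothetical per-rank replica node (`replica_node`), and a simple detection rule (a node already serving `max_procs_per_node` other ranks counts as interfered). The names `home_node` and `route` and all parameters are illustrative only.

```python
from collections import defaultdict


def home_node(offset, stripe_size, num_nodes):
    """I/O node holding the original stripe, assuming round-robin striping."""
    return (offset // stripe_size) % num_nodes


def route(requests, replica_node, stripe_size=4, num_nodes=2,
          max_procs_per_node=1):
    """Assign each (rank, offset) request to an I/O node.

    A request goes to its home node unless that node is already serving
    max_procs_per_node other ranks; in that case it is redirected to the
    node assumed to hold a replica of that rank's segments.
    """
    served = defaultdict(set)  # node -> ranks it is currently serving
    placement = []
    for rank, offset in requests:
        node = home_node(offset, stripe_size, num_nodes)
        interfered = (rank not in served[node]
                      and len(served[node]) >= max_procs_per_node)
        if interfered:
            node = replica_node[rank]  # hypothetical replica location
        served[node].add(rank)
        placement.append((rank, offset, node))
    return placement


if __name__ == "__main__":
    # Two ranks read disjoint regions that are both striped over two nodes.
    reqs = [(0, 0), (1, 8), (0, 4), (1, 12)]
    for rank, offset, node in route(reqs, replica_node={0: 0, 1: 1}):
        print(f"rank {rank} offset {offset} -> node {node}")
```

Without redirection, both nodes in this example would serve both ranks; with it, node 0 serves only rank 0 and node 1 serves only rank 1, which is the property the scheme aims for.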