QCG-OMPI: MPI applications on grids

Emmanuel Agullo, Camille Coti, Thomas Herault, Julien Langou, Sylvain Peyronnet, Ala Rezmerita, Franck Cappello, Jack Dongarra
2011 Future Generation Computer Systems
Computational grids offer promising computational and storage capacities. They can be built by ad hoc aggregation of smaller resources (i.e., clusters) to obtain a large-scale supercomputer. Running general applications on them is challenging for several reasons. The first is inter-process communication: processes running on different clusters must be able to communicate with one another in spite of security equipment such as firewalls and NATs. Another problem raised by grids for communication-intensive parallel applications is the heterogeneity of the available networks that interconnect processes with one another. In this paper we present how QCG-OMPI can execute parallel applications efficiently on computational grids. We first present an MPI programming, communication and execution middleware called QCG-OMPI. We then show how applications can make use of the capabilities of QCG-OMPI through two typical parallel applications: a geophysics application combining collective operations and a master-worker scheme, and a linear algebra application.

On the other hand, clusters must be protected from external intrusions by security equipment such as firewalls and NATs, which are additionally used to reduce the number of public IP addresses needed by clusters. To address this problem without requiring any trade-off between security and connectivity, we designed QCG-OMPI (QosCosGrid-OpenMPI, QosCosGrid standing for Quasi-Opportunistic Supercomputing in Grid environments) [5, 6], an extended MPI library based on a framework providing several basic and advanced connectivity techniques for bypassing firewalls and NATs. The QosCosGrid project (Quasi-Opportunistic Supercomputing for Complex Systems in Grid Environments, http://www.qoscosgrid.eu) [7] aims at developing and executing parallel applications on institutional grids. A full job management and execution stack has been designed in order to support applications on grids. QosCosGrid uses QCG-OMPI as its MPI implementation. Another problem raised by grids for communication-intensive parallel applications is the heterogeneity of the available networks that interconnect processes with one another. Local communication media have a lower latency and a higher bandwidth
doi:10.1016/j.future.2010.11.015 fatcat:c7wg66kyqfc37nfi7ypyybwjie