LogGPO: An accurate communication model for performance prediction of MPI programs

WenGuang Chen, JiDong Zhai, Jin Zhang, WeiMin Zheng
2009 Science in China Series F Information Sciences  
Message passing interface (MPI) is the de facto standard for writing parallel scientific applications on distributed-memory systems. Predicting the performance of MPI programs on current or future parallel systems can help find system bottlenecks or optimize programs. To analyze and predict the performance of a large, complex MPI program effectively, an efficient and accurate communication model is needed. A series of communication models have been proposed, such as the LogP model family,
... which assume that the sending overhead, message transmission, and receiving overhead of a communication are not overlapped with one another and that there is a maximum degree of overlap between computation and communication. However, this assumption does not always hold for MPI programs, because the sending or receiving overhead introduced by MPI implementations can decrease the potential overlap for large messages. In this paper, we present a new communication model, named LogGPO, which captures the potential overlap between computation and communication in MPI programs. We design and implement a trace-driven simulator to verify the LogGPO model by predicting the performance of point-to-point communication and of two real applications, CG and Sweep3D. The average prediction errors of the LogGPO model are 2.4% and 2.0% for these two applications, respectively, while those of the LogGP model are 38.3% and 9.1%, respectively.

Keywords: performance prediction, communication model, LogP, LogGPO, MPI

... forms. The effects of network contention on communication cost are analyzed in the LoPC model [7] and the LoGPC model [8]. Although these models show good accuracy for low-level communication libraries such as Active Messages or the Elan library, hardware-parameterized models ignore the growing effect of high-level communication libraries, such as MPI, on communication cost. To address this problem, some software-parameterized models have been proposed, such as the LogGPS model [9] and the log_nP model [10]. The LogGPS model analyzes the communication cost of switching between different protocols in MPI programs. The log_nP model considers the cost of memory operations when transmitting non-contiguous messages. Non-blocking communication is widely used in MPI programs to hide communication latency by overlapping it with useful computation. The potential degree of overlap between computation and communication can have a great impact on the communication performance of MPI programs.
However, most contemporary MPI implementations cannot provide true overlap between computation and communication, even with non-blocking message-passing interfaces. In fact, significant communication overhead can be introduced during the actual message transmission.
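
To make the overlap argument concrete, the following is a minimal Python sketch, not the LogGPO formulation (which this excerpt does not give). It uses the standard LogGP point-to-point cost for a k-byte message, o + (k-1)G + L + o, and adds a hypothetical parameter, blocked_fraction, standing for the share of communication time that cannot be hidden behind computation. The class and function names, the parameter values, and blocked_fraction itself are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogGPParams:
    """Standard LogGP parameters (values used below are made up)."""
    L: float  # network latency, in seconds
    o: float  # CPU overhead per message on the sender and on the receiver, in seconds
    G: float  # gap per byte for long messages, in seconds per byte


def loggp_p2p_time(p: LogGPParams, nbytes: int) -> float:
    """End-to-end time of one message of nbytes under LogGP: o + (k-1)G + L + o."""
    if nbytes < 1:
        raise ValueError("nbytes must be at least 1")
    return p.o + (nbytes - 1) * p.G + p.L + p.o


def elapsed_with_overlap(comm: float, compute: float, blocked_fraction: float) -> float:
    """Elapsed time when only part of the communication can overlap computation.

    blocked_fraction is a hypothetical knob, not a LogGPO parameter:
    0.0 means communication is fully overlappable, 1.0 means none of it is.
    """
    if not 0.0 <= blocked_fraction <= 1.0:
        raise ValueError("blocked_fraction must be in [0, 1]")
    blocked = comm * blocked_fraction      # serialized with the computation
    overlappable = comm - blocked          # may run concurrently with it
    return blocked + max(compute, overlappable)


if __name__ == "__main__":
    params = LogGPParams(L=5e-6, o=1e-6, G=1e-9)
    comm = loggp_p2p_time(params, 1_000_001)
    for f in (0.0, 0.5, 1.0):
        print(f, elapsed_with_overlap(comm, compute=comm, blocked_fraction=f))
```

With equal computation and communication times, the sketch reports the communication time alone when everything overlaps and twice that when nothing does, which is the kind of gap a model assuming maximum overlap would miss.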
doi:10.1007/s11432-009-0161-2 fatcat:tzrhi7lgkzcgxeyfoinxeag32q