On characterizing bandwidth requirements of parallel applications

Anand Sivasubramaniam, Aman Singla, Umakishore Ramachandran, H. Venkateswaran
1995 Performance Evaluation Review  
Synthesizing architectural requirements from an application viewpoint can help in making important architectural design decisions towards building large scale parallel machines. In this paper, we quantify the link bandwidth requirement on a binary hypercube topology for a set of five parallel applications. We use an execution-driven simulator called SPASM to collect data points for system sizes that are feasible to be simulated. These data points are then used in a regression analysis for
...ing the link bandwidth requirements for larger systems. The requirements are projected as a function of the following system parameters: number of processors, CPU clock speed, and problem size. These results are also used to project the link bandwidths for other network topologies. Our study quantifies the link bandwidth that has to be made available to limit the network overhead in an application to a specified tolerance level. The results show that typical link bandwidths (200-300 MBytes/sec) found in current commercial parallel architectures (such as Intel Paragon and Cray T3D) would have fairly low network overhead for the applications considered in this study. For two of the applications, this overhead is negligible. For the other applications, this overhead can be limited to about 30% of the execution time provided the problem sizes are increased commensurate with the processor clock speed. The technique presented can be useful to a system architect to synthesize the bandwidth requirements for realizing well-balanced parallel architectures.

... k-ary n-cube networks. The results suggest that low-dimensional networks are preferred (based on physical and technological constraints) when the network contention is ignored or when the workload (the application) exhibits sufficient network locality; and that higher dimensional networks may be needed otherwise. Adve and Vernon [1] show using analytical models that network locality has an important role to play in the performance of the mesh. Since network requirements are sensitive to the workload, it is necessary to study them in the context of real applications. The RISC ideology clearly illustrates the importance of using real applications in synthesizing architectural requirements. Several researchers have used this approach for parallel architectural studies [20, 7, 13]. Cypher et al. [7] use a range of scientific applications in quantifying the processing, memory, communication and I/O requirements.
They present the communication requirements in terms of the number of messages exchanged between processors and the volume (size) of these messages. As identified in [22], communication in parallel applications may be categorized by the following attributes: communication volume, the communication pattern, the communication frequency and the ability to overlap communication with computation. A static analysis of the communication as conducted in [7] fails to capture the last two attributes, making it very difficult to quantify the contention in the system. The importance of simulation in capturing the dynamics of parallel system (an application-architecture combination) behavior has been clearly illustrated in [12, 22, 25]. In particular, using an execution-driven simulator, one can faithfully capture all the attributes of communication that are important to network requirements synthesis. For example, in [12] the authors use an execution-...
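
The approach described in the abstract, fitting simulated data points for small systems and extrapolating to larger ones, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the power-law form `bw = a * p**b`, the function names `fit_power_law` and `project`, and the data points, which are generated synthetically from a known curve and are not measurements from SPASM or from the paper. The regression model the authors actually used is not stated in this excerpt.

```python
import math


def fit_power_law(sizes, bandwidths):
    """Least-squares fit of log(bw) = log(a) + b * log(p); returns (a, b).

    The power-law form is an assumed model, chosen only for illustration.
    """
    xs = [math.log(p) for p in sizes]
    ys = [math.log(bw) for bw in bandwidths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept in log-log space.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def project(a, b, p):
    """Evaluate the fitted model at a processor count p."""
    return a * p ** b


# Synthetic placeholder points generated from bw = 2 * p**0.5.
# They stand in for simulator output and are NOT results from the paper.
simulated_sizes = [4, 8, 16, 32, 64]
simulated_bw = [2.0 * p ** 0.5 for p in simulated_sizes]

a, b = fit_power_law(simulated_sizes, simulated_bw)
projected_1024 = project(a, b, 1024)
```

Because the synthetic points lie exactly on the generating curve, the fit recovers its coefficients and the projection at 1024 processors matches the curve. With real simulator output the fit would carry residual error, and the choice of model form would itself need validation against held-out system sizes.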
doi:10.1145/223586.223609 fatcat:rtimlc5tlng4doq5ceorrrxyg4