A Hierarchical Approach to Modeling and Improving the Performance of Scientific Applications on the KSR1

E.L. Boyd, W. Azeem, Hsien-Hsin Lee, Tien-Pao Shih, Shih-Hao Hung, E.S. Davidson
1994 International Conference on Parallel Processing, Vol. 3
We have developed a hierarchical performance bounding methodology that attempts to explain the performance of loop-dominated scientific applications on particular systems. The Kendall Square Research KSR1 is used as a running example. We model the throughput of key hardware units that are common bottlenecks in concurrent machines. The four units currently used are: memory port, floating-point, instruction issue, and a loop-carried dependence pseudo-unit. We propose a workload characterization, and derive upper bounds on the performance of specific machine-workload pairs. Comparing delivered performance with bounds focuses attention on areas for improvement and indicates how much improvement might be attainable. We delineate a comprehensive approach to modeling and improving application performance on the KSR1. Application of this approach is being automated for the KSR1 with a series of tools including K-MA and K-MACSTAT (which enable the calculation of the MACS hierarchy of performance bounds), K-Trace (which allows parallel code to be instrumented to produce a memory reference trace), and K-Cache (which simulates inter-cache communications based on a memory reference trace).
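The bounding idea in the abstract can be illustrated with a minimal sketch. It assumes a simplified bottleneck model in which the time per loop iteration is bounded below by the busiest of the four units; it is not the paper's MACS definition. The class names, field names, and all numeric rates are hypothetical and are not KSR1 figures.

```python
from dataclasses import dataclass


@dataclass
class LoopWorkload:
    """Hypothetical per-iteration workload characterization."""
    mem_ops: int        # memory references per iteration
    fp_ops: int         # floating-point operations per iteration
    instructions: int   # total instructions per iteration
    dep_cycles: float   # cycles per iteration forced by a loop-carried dependence


@dataclass
class Machine:
    """Hypothetical peak throughputs, in operations per cycle."""
    mem_per_cycle: float
    fp_per_cycle: float
    issue_per_cycle: float


def cycles_lower_bound(w: LoopWorkload, m: Machine):
    """Return (cycles per iteration, limiting unit) under the bottleneck assumption."""
    per_unit = {
        "memory port": w.mem_ops / m.mem_per_cycle,
        "floating-point": w.fp_ops / m.fp_per_cycle,
        "instruction issue": w.instructions / m.issue_per_cycle,
        "loop-carried dependence": w.dep_cycles,
    }
    # The slowest unit sets a lower bound on time, i.e. an upper bound on performance.
    limiting = max(per_unit, key=per_unit.get)
    return per_unit[limiting], limiting


def fraction_of_bound(bound_cycles: float, measured_cycles: float) -> float:
    """Delivered performance as a fraction of the bound (1.0 means the bound is met)."""
    return bound_cycles / measured_cycles


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    loop = LoopWorkload(mem_ops=3, fp_ops=2, instructions=8, dep_cycles=1.0)
    machine = Machine(mem_per_cycle=1.0, fp_per_cycle=1.0, issue_per_cycle=2.0)
    bound, unit = cycles_lower_bound(loop, machine)
    print(bound, unit, fraction_of_bound(bound, measured_cycles=10.0))
```

In this sketch the gap between the measured time and the bound, together with the name of the limiting unit, plays the role the abstract describes: it indicates where to look for improvement and how much might be attainable.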
doi:10.1109/icpp.1994.30 dblp:conf/icpp/BoydALSHD94 fatcat:mq3wqmpcpngdpmlpzt2dj2gdne