We have developed a hierarchical performance bounding methodology that attempts to explain the performance of loop-dominated scientific applications on particular systems. The Kendall Square Research KSR1 is used as a running example. We model the throughput of key hardware units that are common bottlenecks in concurrent machines. The four units currently used are: memory port, floating-point, instruction issue, and a loop-carried dependence pseudo-unit. We propose a workload characterization,
doi:10.1109/icpp.1994.30 dblp:conf/icpp/BoydALSHD94 fatcat:mq3wqmpcpngdpmlpzt2dj2gdne
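As a rough illustration of how per-unit throughput limits can bound loop performance, the sketch below computes the cycles each of the four units would need for one loop iteration and takes the largest as a lower bound on time per iteration. The abstract does not give the actual formulas, so combining the units with a maximum is an assumption here. The names `LoopWorkload`, `MachineRates`, `per_unit_cycles`, and `bound_cycles` are hypothetical, and the numbers in the usage example are invented, not KSR1 figures.

```python
from dataclasses import dataclass


@dataclass
class LoopWorkload:
    """Per-iteration operation counts for one loop (hypothetical characterization)."""
    memory_ops: int            # loads and stores per iteration
    fp_ops: int                # floating-point operations per iteration
    instructions: int          # total instructions issued per iteration
    dependence_cycles: float   # cycles forced by the loop-carried dependence chain


@dataclass
class MachineRates:
    """Peak throughput of each hardware unit (assumed values, not measured)."""
    memory_ops_per_cycle: float
    fp_ops_per_cycle: float
    instructions_per_cycle: float


def per_unit_cycles(w: LoopWorkload, m: MachineRates) -> dict:
    """Cycles each unit needs per iteration if it were the only constraint."""
    return {
        "memory port": w.memory_ops / m.memory_ops_per_cycle,
        "floating-point": w.fp_ops / m.fp_ops_per_cycle,
        "instruction issue": w.instructions / m.instructions_per_cycle,
        # The dependence pseudo-unit is already expressed in cycles.
        "loop-carried dependence": w.dependence_cycles,
    }


def bound_cycles(w: LoopWorkload, m: MachineRates) -> tuple:
    """Return (bottleneck unit, lower bound on cycles per iteration).

    Assumption: the slowest unit alone determines the bound.
    """
    cycles = per_unit_cycles(w, m)
    unit = max(cycles, key=cycles.get)
    return unit, cycles[unit]


if __name__ == "__main__":
    # Invented example values for illustration only.
    workload = LoopWorkload(memory_ops=5, fp_ops=2, instructions=8, dependence_cycles=3.0)
    machine = MachineRates(memory_ops_per_cycle=1.0, fp_ops_per_cycle=1.0,
                           instructions_per_cycle=2.0)
    unit, cycles = bound_cycles(workload, machine)
    print(f"bottleneck: {unit}, bound: {cycles} cycles/iteration")
```

With these invented inputs the memory port needs the most cycles per iteration, so it sets the bound. Comparing such a bound with measured run time indicates how much of the observed time the modeled units leave unexplained.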