A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2010; you can also visit the original URL.
The file type is
Chip multiprocessors (CMP) are widely used for high performance computing. Further, these CMPs are being configured in a hierarchical manner to compose a node in a cluster system. A major challenge to be addressed is efficient use of such cluster systems for large-scale scientific applications. In this paper, we quantify the performance gap resulting from using different number of processors per node; this information is used to provide a baseline for the amount of optimization needed whendoi:10.1109/icpp-w.2008.21 dblp:conf/icppw/WuTLS08 fatcat:tb7h5un3mfec7lskh64ydgdkx4