Massive supercomputing coping with heterogeneity of modern accelerators

Toshio Endo, Satoshi Matsuoka
2008 Proceedings, International Parallel and Distributed Processing Symposium (IPDPS)  
Heterogeneous supercomputers with combined general-purpose and accelerated CPUs promise to be the future major architecture due to their wide-ranging generality and superior performance/power ratio. However, developing applications that achieve effective scalability is still very difficult, and is in fact unproven on large-scale machines in such a combined setting. We show an effective method for such heterogeneous systems, so that porting of applications written with homogeneous ...ons can be achieved. To this end, we divide the porting of applications into several steps: analyze the performance of the kernel computation, create processes that virtualize the underlying processors, tune parameters with preference to accelerators, and balance the load between heterogeneous nodes. We apply our method to the parallel Linpack benchmark on the TSUBAME heterogeneous supercomputer. We efficiently utilize both 10,000 general-purpose CPU cores and 648 SIMD accelerators in a combined fashion. The resulting 56.43 TFlops utilized the entire machine, and not only ranked significantly on the Top500 supercomputer list, but is also the highest Linpack performance on heterogeneous systems in the world.
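The load-balancing step named in the abstract can be illustrated with a minimal sketch. It assumes, as one possible reading rather than the authors' actual implementation, that each node runs a number of equal-sized processes proportional to its measured kernel throughput. The function name `allocate_processes`, the node labels, and the throughput figures are all hypothetical and are not taken from the paper.

```python
def allocate_processes(kernel_gflops, total_processes):
    """Split total_processes across nodes in proportion to kernel speed.

    kernel_gflops: dict mapping a (hypothetical) node label to its measured
    kernel throughput. Uses the largest-remainder method so that the
    integer counts always sum to total_processes.
    """
    total_rate = sum(kernel_gflops.values())
    if total_rate <= 0:
        raise ValueError("at least one node must have positive throughput")

    # Ideal (fractional) share of processes for each node.
    shares = {node: total_processes * rate / total_rate
              for node, rate in kernel_gflops.items()}

    # Start from the integer part of each share.
    counts = {node: int(share) for node, share in shares.items()}

    # Hand the leftover processes to the nodes with the largest remainders.
    leftover = total_processes - sum(counts.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - counts[n],
                          reverse=True)
    for node in by_remainder[:leftover]:
        counts[node] += 1
    return counts


# Illustrative numbers only: an accelerated node twice as fast as a CPU-only one.
print(allocate_processes({"cpu_only": 40.0, "cpu_plus_accel": 80.0}, 6))
```

With these illustrative inputs, the accelerated node receives twice as many processes as the CPU-only node, so an application that distributes work evenly per process still sees roughly balanced completion times per node.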
doi:10.1109/ipdps.2008.4536251 dblp:conf/ipps/EndoM08 fatcat:vti2acezl5e7tidjcufqkdye7m