A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2016; you can also visit the original URL.
The file type is application/pdf
.
Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning
2011
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '11
We are witnessing a rapid evolution of HPC node architectures and on-chip parallelism as power and cooling constraints limit increases in microprocessor clock speeds. In this work, we demonstrate a hierarchical approach towards effectively extracting performance for a variety of emerging multicore-based supercomputing platforms. Our examined application is a structured grid-based Lattice Boltzmann computation that simulates homogeneous isotropic turbulence in magnetohydrodynamics. First, we
doi:10.1145/2063384.2063458
dblp:conf/sc/WilliamsOCS11
fatcat:e7snov63dvfklehytpcqbfb7xi