Development of efficient GPU parallelization of WRF Yonsei University planetary boundary layer scheme

M. Huang, J. Mielikainen, B. Huang, H. Chen, H.-L. A. Huang, M. D. Goldberg
2014 Geoscientific Model Development Discussions  
The planetary boundary layer (PBL) is the lowest part of the atmosphere and where its character is directly affected by its contact with the underlying planetary surface. The PBL is responsible for vertical sub-grid-scale fluxes due to eddy transport in the whole atmospheric column. It determines the flux profiles within the well-mixed boundary layer and the more stable layer above. It thus provides an evolutionary model of atmospheric temperature, moisture (including clouds), and horizontal
more » ... , and horizontal momentum in the entire atmospheric column. For such purposes, several PBL models have been proposed and employed in the weather research and forecasting (WRF) model of which the Yonsei University (YSU) scheme is one. To expedite weather research and prediction, we have put tremendous effort into developing an accelerated implementation of the entire WRF model using Graphics Processing Unit (GPU) massive parallel computing architecture whilst maintaining its accuracy as compared to its CPU-based implementation. This paper presents our efficient GPU-based design on WRF YSU PBL scheme. Using one NVIDIA Tesla K40 GPU, the GPU-based YSU PBL scheme achieves a speedup of 193× with respect to its Central Processing Unit (CPU) counterpart running on one CPU core, whereas the speedup for one CPU socket (4 cores) with respect to one CPU core is only 3.5×. We can even boost the speedup to 360× with respect to one CPU core as two K40 GPUs are applied.
doi:10.5194/gmdd-7-8031-2014 fatcat:n4d6v7sffzcqzhybybdhuyum6q