FAMOUS, faster: using parallel computing techniques to accelerate the FAMOUS/HadCM3 climate model with a focus on the radiative transfer algorithm

P. Hanappe, A. Beurivé, F. Laguzet, L. Steels, N. Bellouin, O. Boucher, Y. H. Yamazaki, T. Aina, M. Allen
2011 Geoscientific Model Development  
<p><strong>Abstract.</strong> We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to <i>C</i> and restructuring the algorithm around the computation of a single air column. Instead of the existing MPI-based domain decomposition, we used a task queue and a thread pool to schedule the computation of individual columns on the available processors. Finally, four air columns are packed together in a single data structure and computed simultaneously using Single Instruction Multiple Data operations. <br><br> The modified algorithm runs more than 50 times faster on the CELL's <i>Synergistic Processing Element</i> than on its main PowerPC processing element. On Intel-compatible processors, the new radiation code runs 4 times faster. On the tested graphics processor, using OpenCL, we find a speed-up of more than 2.5 times as compared to the original code on the main CPU. Because the radiation code takes more than 60 % of the total CPU time, FAMOUS executes more than twice as fast. Our version of the algorithm returns bit-wise identical results, which demonstrates the robustness of our approach. We estimate that this project required around two and a half man-years of work.</p>
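The scheduling idea in the abstract (a task queue feeding a thread pool, one air column per task) can be illustrated with a minimal sketch in C. This is not the authors' code: the use of POSIX threads is an assumption, and `compute_column`, `column_queue`, `queue_pop`, `run_columns` and `MAX_THREADS` are hypothetical names introduced only for illustration. The placeholder `compute_column` stands in for the real radiation computation, and the packing of four columns for SIMD execution is not shown.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64  /* hypothetical upper bound on the pool size */

/* Hypothetical stand-in for the per-column radiation computation. */
static double compute_column(int column)
{
    return 2.0 * (double)column;
}

/* Shared task queue: column indices are handed out one at a time. */
typedef struct {
    pthread_mutex_t lock;
    int next;       /* next column index to hand out */
    int n_columns;  /* total number of columns */
    double *out;    /* one result slot per column */
} column_queue;

/* Pop the next column index, or return -1 when the queue is empty. */
static int queue_pop(column_queue *q)
{
    int column = -1;
    pthread_mutex_lock(&q->lock);
    if (q->next < q->n_columns)
        column = q->next++;
    pthread_mutex_unlock(&q->lock);
    return column;
}

/* Worker thread: keep taking columns until none are left. */
static void *worker(void *arg)
{
    column_queue *q = arg;
    int column;
    while ((column = queue_pop(q)) >= 0)
        q->out[column] = compute_column(column);
    return NULL;
}

/* Compute all columns with a pool of n_threads workers.
   Returns 0 on success, -1 if no worker could be started. */
int run_columns(int n_columns, int n_threads, double *out)
{
    pthread_t threads[MAX_THREADS];
    column_queue q;
    int i, started = 0;

    assert(out != NULL);
    if (n_threads < 1 || n_threads > MAX_THREADS)
        return -1;

    q.next = 0;
    q.n_columns = n_columns;
    q.out = out;
    pthread_mutex_init(&q.lock, NULL);

    for (i = 0; i < n_threads; i++) {
        if (pthread_create(&threads[i], NULL, worker, &q) != 0)
            break;
        started++;
    }
    /* Any started worker drains the queue, so joining them finishes the job. */
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&q.lock);
    return started > 0 ? 0 : -1;
}
```

Because each worker pulls the next column only when it is free, columns that take longer to compute do not stall the others, which is one plausible motivation for preferring a queue over a fixed domain decomposition.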
doi:10.5194/gmd-4-835-2011 fatcat:kxke3exme5df7o4jroojsrtjsi