A New Recurrence Formula for Efficient Computation of Spherical Harmonic Transform

2018 Journal of the Meteorological Society of Japan  
1 A new recurrence formula to calculate the associated Legendre functions is proposed for 2 efficient computation of the spherical harmonic transform. This new recurrence formula 3 makes the best use of the fused multiply-add (FMA) operations implemented in modern 4 computers. The computational speeds in calculating the spherical harmonic transform are 5 compared between a numerical code in which the new recurrence formula is implemented 6 and other code using the traditional recurrence
more » ... This comparison shows that 7 implementation of the new recurrence formula contributes to a faster transform. Fur-8 thermore, a scheme to maintain the accuracy of the transform, even when the truncation 9 wavenumber is huge, is also explained. 10 Keywords: associated Legendre functions, recurrence formula, spherical harmonic trans-11 form, spectral method, numerical library 12 In many global atmospheric general circulation models, e.g., OpenIFS (ECMWF, 2 2017), DCPAM (Takahashi, et al., 2016), etc, the spectral method with spherical har-3 monics is used in horizontal discretization. To evaluate the nonlinear terms in these 4 models, the transform method, which was introduced by Orszag (1970) , is used for com-5 putational efficiency. The transform method requires two kinds of transforms: Grid point 6 data are transformed to coefficients for the spherical harmonics in the forward transform 7 and the reverse is done in the backward transform. We call the two kinds of transforms 8 the spherical harmonic transform hereinafter. Because the spherical harmonic transform 9 must be computed many times per one time step in time integrations of the general 10 circulation models, it is a fundamental tool, but it is desirable to reduce the computa-11 tional time for the transform. Note that the spherical harmonic transform is also used in 12 other research areas, e.g., cosmic microwave background studies (Reinecke, et al., 2006), 13 planetary dynamos (Wicht and Tilgner, 2010), etc. 14 The spherical harmonic transform consists of the discrete Fourier transform for the 15 longitudinal direction and a transform from coefficients of associated Legendre functions 16 to function values at specified nodes, which we call the associated Legendre function 17 transform hereinafter, for the latitudinal direction. The discrete Fourier transform can 18 be computed with the fast Fourier transform (FFT) algorithm and its computational 19 complexity is O(M 2 log 2 M ), where M is the truncation wavenumber of the spherical 20 harmonic transform. The associated Legendre function transform has a computational 21 complexity of O(M 3 ), and so it is one of the most time-consuming parts in the computation 22 of dynamics in a spherical spectral model when the horizontal resolution of the model 23 becomes fine. For the associated Legendre function transform, no exact fast algorithm 24 has been discovered. However, several fast algorithms have been proposed to compute 25 1 Seljebotn (2012). In particular, in Seljebotn (2012), a fast algorithm the computational 2 complexity of O(M 2 log 2 M ) was proposed and the implementation detail was described. 3 Although such approximate fast algorithms may be promising when M is large because 4 of the asymptotic behavior of the computational complexity, they have disadvantages, for 5 example, that the implementations are very complicated, that they conduct the transform 6 approximately within a given accuracy, and that they require a huge memory area when 7 M is large. 8 As a strategy to deal with the computational complexity for the associated Legen-9 dre function transform other than exploring a fast algorithm, optimizing corresponding 10 numerical codes is a natural choice. Schaeffer (2013) showed that his highly optimizing 11 numerical code for the spherical harmonic transform spent much less computational time 12 than that spent by other competitive codes, which included numerical codes based on 13 fast algorithms. While several tuning techniques have been incorporated into his nu-14 merical code, one of the most essential parts is that the associated Legendre functions 15 are computed "on-the-fly" during the transform rather than computed and stored before 16 the transform. The reason why this on-the-fly computation contributes to speed tun-17 ing is that it requires much less memory (O(M 2 )) than that required when computed 18 in advance (O(M 3 )). Therefore, this takes advantage of the cache memory on modern 19 computers. However, computing the associated Legendre functions on-the-fly has compu-20 tational complexity of the same order as that for the transform itself (O(M 3 )). Because 21 the associated Legendre functions are computed by using a recurrence formula, a more 22
doi:10.2151/jmsj.2018-019 fatcat:ma5v776j3vexthjadevdzepjte