Efficient and portable acceleration of quantum chemical many-body methods in mixed floating point precision using OpenACC compiler directives

Janus J. Eriksen
2016 Molecular Physics  
It is demonstrated how the non-proprietary OpenACC standard of compiler directives may be used to compactly and efficiently accelerate the rate-determining steps of two of the most routinely applied many-body methods of electronic structure theory, namely the second-order Møller-Plesset (MP2) model in its resolution-of-the-identity (RI) approximated form and the (T) triples correction to the coupled cluster singles and doubles model (CCSD(T)). By means of compute directives as well as the use
more » ... optimized device math libraries, the operations involved in the energy kernels have been ported to graphics processing unit (GPU) accelerators, and the associated data transfers correspondingly optimized to such a degree that the final implementations (using either double and/or single precision arithmetics) are capable of scaling to as large systems as allowed for by the capacity of the host central processing unit (CPU) main memory. The performance of the hybrid CPU/GPU implementations is assessed through calculations on test systems of alanine amino acid chains using one-electron basis sets of increasing size (ranging from double- to pentuple-zeta quality). For all but the smallest problem sizes of the present study, the optimized accelerated codes (using a single multi-core CPU host node in conjunction with six GPUs) are found to be capable of reducing the total time-to-solution by at least an order of magnitude over optimized, OpenMP-threaded CPU-only reference implementations.
doi:10.1080/00268976.2016.1271155 fatcat:xjikupqqengx5jr5mu2dj74twu