Optimierung von Life Sciences Algorithmen für GPUs mit CUDA/OpenCL [Optimization of Life Sciences Algorithms for GPUs with CUDA/OpenCL]

David Dilch
2013 unpublished
A research focus in scientific computing is the parallelisation of algorithms for GPUs, because the theoretical peak performance of GPUs is many times higher than that of CPUs. This master's thesis begins with the Nvidia Fermi GPU architecture and how it differs from its predecessor, GT200. It then explains the GPU programming languages CUDA and OpenCL and compares how programming differs between them. The aim of the thesis is to optimize two life science algorithms (Needleman-Wunsch/Smith-Waterman and Direct Coulomb Summation) for execution on graphics cards (Nvidia Fermi, Nvidia GT200) using CUDA and OpenCL. The Direct Coulomb Summation (DCS) algorithm is much better suited to graphics cards than the Needleman-Wunsch/Smith-Waterman (NW/SW) algorithm because its degree of parallelism is significantly higher. The thesis determines how effective the optimization techniques are on the different GPU architectures (Fermi, GT200) and in the different GPU programming languages (CUDA, OpenCL). In doing so, it examines whether the new cache hierarchy for global memory in the Fermi architecture makes certain optimization techniques (memory coalescing, use of the small on-chip memories) unnecessary, and therefore allows CPU code to be ported to the GPU more easily without losing performance. Another important concern is a fair comparison between the GPU program and the CPU program when answering the question of whether the two algorithms achieve a speedup on the GPU relative to the CPU. For a fair comparison, all CPU cores are used by parallelising the CPU code with OpenMP.
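The difference in parallelism between the two algorithms can be illustrated with a minimal CPU-side sketch in C++. It is not taken from the thesis: the `Atom` layout, the function names `dcs_slice` and `sw_score`, the scoring parameters, the linear gap penalty and the choice of units (Coulomb constant set to 1) are all assumptions made for illustration. In the DCS loop every grid point is computed independently of every other, so each point could map to its own GPU thread. In the Smith-Waterman loop each cell needs its left, upper and upper-left neighbours, which limits parallel work to one anti-diagonal at a time.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical atom layout: position (x, y, z) and partial charge q.
struct Atom { float x, y, z, q; };

// Direct Coulomb Summation on one z-slice of a regular grid.
// Each grid point sums q / r over all atoms (Coulomb constant assumed to be 1).
// No grid point reads another grid point's result, so the two outer loops
// are fully parallel. Assumes no atom sits exactly on a grid point.
std::vector<float> dcs_slice(const std::vector<Atom>& atoms,
                             std::size_t nx, std::size_t ny,
                             float spacing, float z) {
    assert(nx > 0 && ny > 0 && spacing > 0.0f);
    std::vector<float> potential(nx * ny, 0.0f);
    for (std::size_t j = 0; j < ny; ++j) {
        for (std::size_t i = 0; i < nx; ++i) {
            const float px = static_cast<float>(i) * spacing;
            const float py = static_cast<float>(j) * spacing;
            float sum = 0.0f;
            for (const Atom& a : atoms) {
                const float dx = px - a.x;
                const float dy = py - a.y;
                const float dz = z - a.z;
                sum += a.q / std::sqrt(dx * dx + dy * dy + dz * dz);
            }
            potential[j * nx + i] = sum;
        }
    }
    return potential;
}

// Smith-Waterman local alignment score with a linear gap penalty
// (illustrative scoring values). Cell (i, j) depends on (i-1, j-1),
// (i-1, j) and (i, j-1), so only cells on the same anti-diagonal
// are independent of each other.
int sw_score(const std::string& a, const std::string& b,
             int match = 2, int mismatch = -1, int gap = -1) {
    std::vector<int> prev(b.size() + 1, 0);
    std::vector<int> curr(b.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = 0;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            const int up = prev[j] + gap;
            const int left = curr[j - 1] + gap;
            curr[j] = std::max({0, diag, up, left});
            best = std::max(best, curr[j]);
        }
        std::swap(prev, curr);
    }
    return best;
}
```

For example, `dcs_slice({{0.0f, 0.0f, 1.0f, 1.0f}}, 2, 1, 1.0f, 0.0f)` evaluates two grid points against a single unit charge, and `sw_score("ACGT", "ACGT")` scores four matches. A GPU version would replace the two outer DCS loops with a thread grid, whereas the alignment kernel would have to process the matrix one anti-diagonal at a time or run many independent alignments concurrently.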
doi:10.25365/thesis.30677 fatcat:wc2pjbmspza6vdiejho3radjrq