Exploiting graphical processing units for data-parallel scientific applications
Concurrency and Computation
Graphical Processing Units (GPUs) have recently attracted attention for scientific applications such as particle simulations. This is driven partly by the low commodity pricing of GPUs, but also by recent toolkit and library developments that make them more accessible to scientific programmers. We report on two further application paradigms, regular mesh field equations with unusual boundary conditions and graph analysis algorithms, that can also make use of GPU architectures. We discuss the
... ance of these application paradigms to simulation engines and games. GPUs were aimed primarily at the accelerated graphics market, but since this market is often closely coupled to advanced game products, it is interesting to speculate about the future of fully integrated accelerator hardware for combined visualisation and simulation. As well as reporting speed-up performance on selected simulation paradigms, we discuss suitable data-parallel algorithms and present code examples for exploiting GPU features such as large numbers of threads and localised texture memory. We find a surprising variation in the performance that can be achieved on GPUs for our applications and discuss how these findings relate to previously known effects in parallel computing, such as memory speed-related super-linear speed-up.

/**
 * Returns the number of threads per block that gives the highest
 * multiprocessor occupancy. Prefers larger thread blocks over smaller
 * registers The number of registers needed by each thread.
 * sharedMemPerThread The shared memory needed per thread (bytes).
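
The comment fragment above documents a helper that chooses a thread-block size from per-thread register and shared-memory requirements. Its body is not part of the surviving text, so the following is only a minimal sketch of how such a helper might work, not the authors' implementation. The names `DeviceLimits`, `residentThreads` and `bestThreadsPerBlock` are hypothetical, and the default limits are assumed values typical of early CUDA hardware (compute capability 1.0); they should be replaced with the limits of the actual target device. The sketch also ignores register and shared-memory allocation granularity, so it can be optimistic.

```cpp
#include <algorithm>
#include <cassert>

// Assumed per-multiprocessor limits (compute capability 1.0 class hardware).
// These are illustrative defaults, not values taken from the paper.
struct DeviceLimits {
    int warpSize           = 32;
    int maxThreadsPerBlock = 512;
    int maxThreadsPerSM    = 768;
    int maxBlocksPerSM     = 8;
    int registersPerSM     = 8192;
    int sharedMemPerSM     = 16384;  // bytes
};

// Threads that can be resident on one multiprocessor for a given block size.
// Simplification: allocation granularity is ignored.
int residentThreads(int threadsPerBlock, int registers,
                    int sharedMemPerThread, const DeviceLimits& d) {
    int blocks = std::min(d.maxBlocksPerSM, d.maxThreadsPerSM / threadsPerBlock);
    if (registers > 0)
        blocks = std::min(blocks, d.registersPerSM / (registers * threadsPerBlock));
    if (sharedMemPerThread > 0)
        blocks = std::min(blocks, d.sharedMemPerSM / (sharedMemPerThread * threadsPerBlock));
    return blocks * threadsPerBlock;
}

// Returns the block size (a multiple of the warp size) with the most resident
// threads, or 0 if no block size fits. Ties go to the larger block.
int bestThreadsPerBlock(int registers, int sharedMemPerThread,
                        const DeviceLimits& d = DeviceLimits{}) {
    assert(registers >= 0 && sharedMemPerThread >= 0);
    int best = 0, bestResident = 0;
    for (int t = d.warpSize; t <= d.maxThreadsPerBlock; t += d.warpSize) {
        int r = residentThreads(t, registers, sharedMemPerThread, d);
        if (r > 0 && r >= bestResident) {  // ">=" prefers larger blocks on ties
            best = t;
            bestResident = r;
        }
    }
    return best;
}
```

Under these assumed limits, a kernel needing 10 registers per thread and no shared memory gets 384 threads per block from `bestThreadsPerBlock(10, 0)`: two such blocks fill all 768 thread slots of a multiprocessor, and no larger block size does so.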