High performance predictable histogramming on GPUs

Cedric Nugteren, Gert-Jan van den Braak, Henk Corporaal, Bart Mesman
2011 Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units - GPGPU-4  
Graphics Processing Units (GPUs) are suitable for highly data parallel algorithms such as image processing, due to their massive parallel processing power. Many image processing applications use the histogramming algorithm, which fills a set of bins according to the frequency of occurrence of pixel values taken from an input image. Histogramming has been mapped on a GPU prior to this work. Although significant research effort has been spent in optimizing the mapping, we show that the
more » ... and performance predictability of existing methods can still be improved. In this paper, we present two novel histogramming methods, both achieving a higher performance and predictability than existing methods. We discuss performance limitations for both novel methods by exploring algorithm trade-offs. Both the novel and the existing histogramming methods are evaluated for performance. The first novel method gives an average performance increase of 33% over existing methods for non-synthetic benchmarks. The second novel method gives an average performance increase of 56% over existing methods and guarantees to be fully data independent. While the second method is specifically designed for newer GPU architectures, the first method is also suitable for older architectures.
doi:10.1145/1964179.1964181 dblp:conf/asplos/NugterenBCM11 fatcat:o2xtvsjrz5hn5og6ua6jcjh25u