A performance model of fast 2D-DCT parallel JPEG encoding using CUDA GPU and SMP-architecture

Mohammed K. Ali Shatnawi, Hussein Ali Shatnawi
2014 IEEE High Performance Extreme Computing Conference (HPEC)
The performance of image compression algorithms for big data can be enhanced using parallel computation. The JPEG algorithm is a lossy compression method that uses the discrete cosine transform (DCT) to eliminate high-frequency components. In this paper, we describe a cross-compatible design of JPEG on SMP and GPU architectures. To achieve maximal efficiency, we exploit the substantial parallelism available to design an optimized version of JPEG based on a thread model. A fair evaluation of the algorithm on 24-bit BMP images, using several performance metrics, is run on the fully optimized GPU using CUDA and on the SMP using the SESC simulator. Our cross-architectural evaluation revealed a speedup of 25.49 on SESC and 21 on the GPU, and that the CPU outperformed the GPU for JPEG.
doi:10.1109/hpec.2014.7040947 dblp:conf/hpec/ShatnawiS14 fatcat:4xhkjw6qpfbivjml3zaug7aemi
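
The block-level parallelism the abstract refers to can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard JPEG 8x8 block size and an orthonormal DCT-II, and the names `dct_1d`, `dct_2d`, `split_blocks`, and `parallel_dct` are hypothetical. It uses a naive matrix form of the transform rather than a fast DCT, and Python threads show only how the work decomposes into independent blocks, not the speedups reported above.

```python
import math
from concurrent.futures import ThreadPoolExecutor

N = 8  # JPEG block size (assumed; the standard baseline value)

# Orthonormal DCT-II basis: C[u][x] = a(u) * cos((2x + 1) * u * pi / (2N))
C = [[(math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N))
      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]

def dct_1d(vec):
    """1D DCT-II of a length-N sequence."""
    return [sum(C[u][x] * vec[x] for x in range(N)) for u in range(N)]

def dct_2d(block):
    """2D DCT of one N x N block via row-column decomposition."""
    rows = [dct_1d(r) for r in block]              # transform each row
    cols = [dct_1d(list(c)) for c in zip(*rows)]   # transform each column
    return [list(r) for r in zip(*cols)]           # transpose back to [u][v]

def split_blocks(image):
    """Split an image (list of rows, dimensions multiples of N) into N x N blocks."""
    h, w = len(image), len(image[0])
    return [[row[bx:bx + N] for row in image[by:by + N]]
            for by in range(0, h, N) for bx in range(0, w, N)]

def parallel_dct(image, workers=4):
    """Transform every block independently; one task per block."""
    blocks = split_blocks(image)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(dct_2d, blocks))

if __name__ == "__main__":
    flat = [[10] * 16 for _ in range(16)]  # constant 16 x 16 test image
    coeffs = parallel_dct(flat)
    print(len(coeffs), round(coeffs[0][0][0], 6))
```

Because each 8x8 block is transformed without reference to its neighbours, the same decomposition maps naturally onto one GPU thread block or one SMP thread per image block. For a constant input, all energy lands in the DC coefficient and the remaining coefficients are numerically zero.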