Implementation of a Fully-Parallel Turbo Decoder on a General-Purpose Graphics Processing Unit

An Li, Robert G. Maunder, Bashir M. Al-Hashimi, Lajos Hanzo
2016 IEEE Access  
Turbo codes comprising a parallel concatenation of upper and lower convolutional codes are widely employed in state-of-the-art wireless communication standards, since they facilitate transmission throughputs that closely approach the channel capacity. However, this necessitates high processing throughputs in order for the turbo code to support real-time communications. In state-of-the-art turbo code implementations, the processing throughput is typically limited by the data dependencies that
occur within the forward and backward recursions of the Log-BCJR algorithm, which is employed during turbo decoding. In contrast to the highly serial Log-BCJR turbo decoder, we have recently proposed a novel Fully Parallel Turbo Decoder (FPTD) algorithm, which can eliminate the data dependencies and perform fully parallel processing. In this paper, we propose an optimized FPTD algorithm, which reformulates the operation of the FPTD algorithm so that the upper and lower decoders have identical operation, in order to support Single Instruction Multiple Data (SIMD) operation. This allows us to develop a novel General Purpose Graphics Processing Unit (GPGPU) implementation of the FPTD, which has application in Software-Defined Radios (SDRs) and virtualized Cloud-Radio Access Networks (C-RANs). As a benefit of its higher degree of parallelism, we show that our FPTD improves the processing throughput of the Log-BCJR turbo decoder by between 2.3 and 9.2 times, when employing a high-specification GPGPU. However, this is achieved at the cost of a moderate increase of the overall complexity by between 1.7 and 3.3 times.
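The data dependency mentioned in the abstract can be illustrated with a small sketch. This is not the authors' FPTD algorithm or their GPGPU code; it is a toy max-log forward recursion on a generic trellis, written in Python for readability. The function names, the `gamma[k][p][s]` branch-metric layout, and the assumption that the trellis starts in state 0 are all hypothetical choices made for this illustration. The serial version must finish stage k-1 before it can start stage k, whereas the sweep-based version updates every stage at once from the previous sweep's values, so each (stage, state) update is independent of the others within a sweep.

```python
NEG_INF = float("-inf")

def serial_forward(gamma, num_states):
    """Max-log forward recursion: stage k needs the finished stage k-1."""
    n = len(gamma)
    alpha = [[NEG_INF] * num_states for _ in range(n + 1)]
    alpha[0][0] = 0.0  # assumption: the trellis starts in state 0
    for k in range(1, n + 1):  # inherently sequential over k
        for s in range(num_states):
            alpha[k][s] = max(alpha[k - 1][p] + gamma[k - 1][p][s]
                              for p in range(num_states))
    return alpha

def parallel_sweep(alpha, gamma, num_states):
    """One sweep: every stage reads only values from the previous sweep."""
    n = len(gamma)
    new = [row[:] for row in alpha]  # stage 0 is carried over unchanged
    for k in range(1, n + 1):  # no dependency between iterations of this loop
        for s in range(num_states):
            new[k][s] = max(alpha[k - 1][p] + gamma[k - 1][p][s]
                            for p in range(num_states))
    return new

def parallel_forward(gamma, num_states, sweeps):
    """Repeat independent sweeps, starting from uninformative state metrics."""
    n = len(gamma)
    alpha = [[0.0] * num_states for _ in range(n + 1)]
    alpha[0] = [0.0] + [NEG_INF] * (num_states - 1)
    for _ in range(sweeps):
        alpha = parallel_sweep(alpha, gamma, num_states)
    return alpha
```

In this toy setting the sweep-based version reproduces the serial result once the number of sweeps equals the number of trellis stages, so it performs more total work in exchange for removing the dependency inside each sweep. That is the same kind of trade-off the abstract reports (higher throughput at the cost of higher overall complexity), although the figures quoted there come from the authors' implementation, not from this sketch.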
doi:10.1109/access.2016.2586309 fatcat:cqeypibnrjaabe54xmpj3b2slq