A Flexible-Length-Arithmetic Processor Using Embedded DSP Slices and Block RAMs in FPGAs

Md. Nazrul Islam Mondal, Kohan Sai, Koji Nakano, Yasuaki Ito
2013 First International Symposium on Computing and Networking  
Some applications, such as RSA encryption/decryption, need integer arithmetic operations on numbers with many bits. However, conventional CPUs cannot perform such operations directly, because their instructions support only integers of a fixed width, say, 64 bits. Since a CPU must repeat fixed-width arithmetic operations to process longer numbers, it incurs considerable overhead when executing applications that involve many-bit integer arithmetic. On the other hand, we can implement hardware algorithms for such applications in FPGAs for further acceleration. However, implementing a hardware algorithm is usually very complicated, and debugging hardware is hard. The main contribution of this paper is to present an intermediate approach between software and hardware using FPGAs. More specifically, we present a processor based on the FDFM (Few DSP slices and Few Memory blocks) approach that supports arithmetic operations on numbers of flexible bit length, and implement it in an FPGA. The arithmetic instructions of our processor architecture include addition, subtraction, and multiplication for numbers of variable size longer than 64 bits. To show the potential of our processor, we have implemented 2048-bit RSA encryption/decryption as software written in machine instructions. The resulting processor uses only one DSP48E1 slice and four Block RAMs (BRAMs), and the RSA encryption software on it runs in 635.65 ms. It has been shown that a direct hardware implementation of RSA encryption runs in 277.26 ms. Although our intermediate approach is about 2.3 times slower, it has several advantages. Since the algorithm is written as software, development and debugging are easy. It is also more flexible and scalable.
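To illustrate the overhead the abstract attributes to fixed-width CPUs, here is a minimal Python sketch of many-bit addition and multiplication built from repeated fixed-width word operations. It is not the paper's processor or instruction set: the 64-bit word width (`LIMB_BITS`), the little-endian word lists, and the function names `to_limbs`, `from_limbs`, `add_limbs`, and `mul_limbs` are assumptions made for this example, and the multiplication is the plain schoolbook method.

```python
LIMB_BITS = 64                      # assumed fixed word width, as in "say, 64 bits"
MASK = (1 << LIMB_BITS) - 1


def to_limbs(n, count):
    """Split a non-negative integer into `count` little-endian words."""
    return [(n >> (LIMB_BITS * i)) & MASK for i in range(count)]


def from_limbs(limbs):
    """Reassemble a little-endian word list into one integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def add_limbs(a, b):
    """Add two equal-length word lists; return (sum words, carry out)."""
    out, carry = [], 0
    for x, y in zip(a, b):
        s = x + y + carry           # one fixed-width add with carry in
        out.append(s & MASK)
        carry = s >> LIMB_BITS
    return out, carry


def mul_limbs(a, b):
    """Schoolbook multiplication; the result has len(a) + len(b) words."""
    out = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            # One word-by-word product plus accumulation and carry.
            t = out[i + j] + x * y + carry
            out[i + j] = t & MASK
            carry = t >> LIMB_BITS
        out[i + len(b)] = carry     # top word of this row
    return out
```

With 64-bit words, a 2048-bit operand occupies 32 words, so one schoolbook multiplication of two such operands takes 32 × 32 = 1024 word products. This is the kind of repeated fixed-width work that motivates native support for long operands.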
doi:10.1109/candar.2013.19 dblp:conf/ic-nc/MondalSNI13 fatcat:ajkgh335mzhhtgbtgybu3oritu