Architectural techniques for accelerating subword permutations with repetitions

J.P. McGregor, R.B. Lee
2003 IEEE Transactions on Very Large Scale Integration (VLSI) Systems
We propose two new instructions, swperm and sieve, that can be used to efficiently complete an arbitrary bit-level permutation of an n-bit word with or without repetitions. Permutations with repetitions are rearrangements of an ordered set in which elements may replace other elements in the set; such permutations are useful in cryptographic algorithms. On a four-way superscalar processor, we can complete an arbitrary 64-bit permutation with repetitions of 1-bit subwords in 11 instructions and only four cycles using the two proposed instructions. For subwords of size 4 bits or greater, we can perform an arbitrary permutation with repetitions of a 64-bit register in a single cycle using a single swperm instruction. This improves upon previous results by requiring fewer instructions to permute 4-bit or larger subwords packed in a 64-bit register and fewer execution cycles for 1-bit subwords on wide superscalar processors. We also demonstrate that the proposed instructions accelerate the popular DES block cipher. We obtain a DES performance improvement of at least 55% in constrained embedded environments and an improvement of 71% on a four-way superscalar processor when applying DES as a cryptographic hash function.
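To make the notion of a permutation with repetitions concrete, the following is a minimal software sketch, not the swperm or sieve instruction encoding, which the abstract does not specify. It assumes subwords are numbered from the least significant end of the word and that each output position names the source subword it copies; the function name `permute_with_repetitions` and its parameters are hypothetical.

```python
def permute_with_repetitions(word, indices, subword_bits=4, word_bits=64):
    """Software model: output subword i is a copy of source subword indices[i].

    Assumption (not from the source): subword 0 is the least significant one.
    The same source index may appear several times, which is what makes this
    a permutation *with repetitions*.
    """
    count = word_bits // subword_bits
    if len(indices) != count:
        raise ValueError(f"expected {count} indices, got {len(indices)}")
    mask = (1 << subword_bits) - 1
    result = 0
    for i, src in enumerate(indices):
        if not 0 <= src < count:
            raise ValueError(f"index {src} out of range")
        # Extract the selected source subword and place it at position i.
        subword = (word >> (src * subword_bits)) & mask
        result |= subword << (i * subword_bits)
    return result


word = 0xFEDCBA9876543210

# Ordinary permutation: reverse the order of the sixteen 4-bit subwords.
reversed_word = permute_with_repetitions(word, list(range(15, -1, -1)))

# Permutation with repetitions: every output position copies subword 3.
broadcast = permute_with_repetitions(word, [3] * 16)

print(hex(reversed_word))
print(hex(broadcast))
```

The loop handles one subword per iteration, so a 64-bit word of 4-bit subwords takes sixteen steps in this model; the abstract's claim is that a single swperm instruction covers that case in one cycle.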
doi:10.1109/tvlsi.2003.812318 fatcat:z5iysudvgrhgbpaydfgshymxra