FLiMS: Fast Lightweight Merge Sorter

Philippos Papaphilippou, Chris Brooks, Wayne Luk
2018 International Conference on Field-Programmable Technology (FPT)
We have developed a highly efficient and simple parallel hardware design for merging two sorted lists residing in banked (or multi-ported) memory. The FPGA implementation uses half the hardware resources required by the current state-of-the-art architecture, while achieving better performance and half the latency for the same amount of parallelism. The challenges for merge operations on FPGAs have been the low clock frequency caused by the feedback datapath of the merger, which lies on the critical path for timing, and the high resource utilisation of recent attempts to eliminate the feedback datapath. Our solution uses a modified version of the bitonic merge block, as found in a bitonic sorter, repurposed to perform parallel merging of streaming data. As with the state of the art, it can be considered feedback-less, since it only nests one parallel comparison for any desired level of parallelism. This leads to designs with a high operating frequency, 1.3 times higher than the previous best on our test platform. Since the new design uses half the hardware resources, it allows more parallelism and leaves room for additional logic.
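To make the idea of reusing a bitonic merge block for streaming data more concrete, the following is a minimal software sketch in Python. It is not taken from the paper and does not model the hardware design, its banked memory, or its timing. The names `bitonic_merge` and `merge_streams`, the parameter `k` (elements emitted per simulated cycle, assumed to be a power of two), and the use of `math.inf` as padding are all assumptions made for illustration. Each simulated cycle performs one layer of `k` independent comparisons between the heads of the two lists, then sorts the `k` selected values with a standard bitonic merge network.

```python
import math


def bitonic_merge(seq):
    """Sort a bitonic sequence (power-of-two length) in ascending order."""
    n = len(seq)
    out = list(seq)
    step = n // 2
    while step >= 1:
        # One network stage: compare-exchange elements `step` apart.
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                if out[i] > out[i + step]:
                    out[i], out[i + step] = out[i + step], out[i]
        step //= 2
    return out


def merge_streams(a, b, k):
    """Merge sorted lists a and b, emitting k elements per simulated cycle.

    Illustrative model only; values are assumed to be smaller than math.inf.
    """
    assert k >= 1 and k & (k - 1) == 0, "k must be a power of two"
    total = len(a) + len(b)
    i = j = 0
    out = []
    while len(out) < total:
        # Next k candidates from each list, padded with +inf when exhausted.
        wa = a[i:i + k]
        wa = wa + [math.inf] * (k - len(wa))
        wb = b[j:j + k]
        wb = wb + [math.inf] * (k - len(wb))
        # A single layer of k parallel comparisons: a[t] against b[k-1-t].
        take_a = [wa[t] <= wb[k - 1 - t] for t in range(k)]
        chosen = [wa[t] if take_a[t] else wb[k - 1 - t] for t in range(k)]
        # `chosen` holds the k smallest candidates as a bitonic sequence.
        out.extend(bitonic_merge(chosen))
        # Advance each input by the number of elements it contributed.
        from_a = sum(take_a)
        i += from_a
        j += k - from_a
    return out[:total]  # drop any +inf padding emitted in the last cycle


if __name__ == "__main__":
    print(merge_streams([1, 4, 7, 10], [2, 3, 5, 8, 9, 11], 4))
```

In this model the only data-dependent decision per cycle is the single layer of `k` comparisons, after which the remaining network is a fixed sequence of compare-exchange stages. How the actual design dequeues from memory banks and pipelines these stages is described in the paper itself.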
doi:10.1109/fpt.2018.00022 dblp:conf/fpt/PapaphilippouBL18 fatcat:yimiidhh4jc3bpxracyus735c4