Remarn: A Reconfigurable Multi-threaded Multi-core Accelerator for Recurrent Neural Networks

Zhiqiang Que, Hiroki Nakahara, Hongxiang Fan, He Li, Jiuxi Meng, Kuen Hung Tsoi, Xinyu Niu, Eriko Nurvitadhi, Wayne Luk
<span title="2022-05-17">2022</span> <i title="Association for Computing Machinery (ACM)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/qfvmhupy5fb6tcrtb2orjpq5e4" style="color: black;">ACM Transactions on Reconfigurable Technology and Systems</a> </i> &nbsp;
This work introduces Remarn, a reconfigurable multi-threaded multi-core accelerator supporting both spatial and temporal co-execution of Recurrent Neural Network (RNN) inferences. It increases processing capabilities and quality of service of cloud-based neural processing units (NPUs) by improving their hardware utilization and by reducing design latency, with two innovations. First, a custom coarse-grained multi-threaded (CGMT) RNN/LSTM hardware architecture, switching tasks among threads when
more &raquo; ... RNN computational engines meet data hazards. Second, the partitioning of this hardware architecture into multiple full-fledged sub-accelerator cores, enabling spatially co-execution of multiple RNN/LSTM inferences. These innovations improve the exploitation of the available parallelism to increase run-time hardware utilization and boost design throughput. Evaluation results show that a dual-threaded quad-core Remarn NPU achieves 2.91 times higher performance while only occupying 5.0% more area than a single-threaded one on a Stratix 10 FPGA. When compared with a Tesla V100 GPU implementation, our design achieves 6.5 times better performance and 15.6 times higher power efficiency, showing that our approach contributes to high performance and energy-efficient FPGA-based multi-RNN inference designs for datacenters.
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/3534969">doi:10.1145/3534969</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/g2lj3qnv3bdsxb5mkmwlkoezpu">fatcat:g2lj3qnv3bdsxb5mkmwlkoezpu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220518132420/https://dl.acm.org/doi/pdf/10.1145/3534969" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/bc/7e/bc7e3fb53c7c36467ecfd09559b5ff32446818ae.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/3534969"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> acm.org </button> </a>