Ring-Mesh: A Scalable and High-Performance Approach for Manycore Accelerators [article]

Somnath Mazumdar, Alberto Scionti
2019 arXiv pre-print
A growing number of works address the design challenge of fast, scalable solutions for the expanding domain of machine-learning-based applications. Most recent solutions aim at improving processing element capabilities to speed up the execution of deep learning (DL) applications. However, only a few works focus on the interconnection subsystem as a potential source of performance improvement. Wrapping many cores together offers excellent parallelism, but it comes with multiple challenges (e.g., adequate interconnections). Scalable, power-aware interconnects are required to support such a growing number of processing elements, as well as modern applications. In this paper, we propose a scalable and efficient Network-on-Chip (NoC) architecture that fuses the advantages of rings and the 2D-mesh, without using any bridge router, to provide high performance. A dynamic adaptation mechanism allows the network to better adapt to the application requirements. Simulation results show better scalability (up to 1024 processing elements) with robust performance under multiple statistical traffic patterns.
arXiv:1904.03428v1 fatcat:avwhcxwgkzfc5ma5uwcimiex34