Winning with Pinning in NoC

Ahmed Abousamra, Rami Melhem, Alex Jones
2009 17th IEEE Symposium on High Performance Interconnects
In Chip Multiprocessors (CMPs), the on-chip interconnect carries data and coherence traffic exchanged between on-chip cache banks. Reducing communication latency is critical for improving the performance of applications running on CMPs. Communication latency is affected by network design, cache organization, and application design. Previously proposed techniques for reducing router latency using express virtual channels or hybrid circuit switching effectively reduce communication latency. However, our analysis of the communication traffic of a suite of scientific and commercial workloads on a 16-core cache-coherent CMP showed low utilization of circuits due to repeated establishment and teardown of circuits. In this paper, we explore circuit pinning, an efficient way of establishing circuits that promotes higher circuit utilization, adapts to changes in communication characteristics, simplifies network control, and allows smarter routing techniques due to the stability of configured circuits. Comparison with state-of-the-art packet-switched and hybrid circuit-switched interconnects across different cache organizations demonstrates the benefits of our technique.

I. INTRODUCTION

CMP systems rely on the on-chip interconnect to facilitate communication between different cores, cache banks, memory controllers, and shared on-chip functional units. The latency of communication over the interconnect has a significant effect on CMP system performance. This effect will continue to increase as technology scales down, enabling many more cores to be put on a chip.

Previous research has attempted to reduce communication latency in a variety of ways. Many designs have been proposed to reduce global hop count: the flattened butterfly topology [10] uses high-radix routers to enrich connectivity; the concentrated mesh [2] shares each router among multiple nodes; the hybrid ring/mesh interconnect [4] breaks the 2D mesh interconnect into smaller mesh interconnects connected by a global ring; the hybrid mesh/bus interconnect [6] uses buses as local interconnects and a global mesh interconnect to connect the buses; and in 3D stacked chips, a low-radix and low-diameter 3D interconnect [18] connects every pair of communication points in at most 3 hops.
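The hop-count argument behind these topologies can be illustrated with a small calculation. The following is only a sketch under stated assumptions, not the paper's evaluation method: it assumes 16 nodes, minimal (Manhattan-distance) routing, uniform traffic between all pairs of distinct nodes, and a concentration factor of 4 for the concentrated mesh. The function name `average_hops` and its parameters are hypothetical, and a hop is counted as one router-to-router link.

```python
from itertools import product


def average_hops(mesh_side, nodes_per_router=1):
    """Average router-to-router link hops between distinct nodes.

    Assumes a mesh_side x mesh_side 2D mesh with minimal routing, so the
    hop count between two routers is their Manhattan distance. Nodes that
    share a router (nodes_per_router > 1) are zero link hops apart.
    """
    routers = list(product(range(mesh_side), repeat=2))
    # Attach nodes_per_router nodes to every router.
    nodes = [r for r in routers for _ in range(nodes_per_router)]

    total_hops = 0
    pair_count = 0
    for i, (x1, y1) in enumerate(nodes):
        for j, (x2, y2) in enumerate(nodes):
            if i == j:
                continue  # skip a node talking to itself
            total_hops += abs(x1 - x2) + abs(y1 - y2)
            pair_count += 1
    return total_hops / pair_count


if __name__ == "__main__":
    # 16 nodes, one per router, in a 4x4 mesh.
    print("plain 4x4 mesh:", average_hops(4))
    # The same 16 nodes, four per router, in a 2x2 concentrated mesh.
    print("concentrated 2x2 mesh:", average_hops(2, nodes_per_router=4))
```

Under these assumptions the plain 4x4 mesh averages 8/3 (about 2.67) link hops per pair, while the concentrated 2x2 mesh averages 16/15 (about 1.07). The model ignores the larger router radix and the contention that concentration introduces, so it indicates only the direction of the effect.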
Another approach for reducing communication latency is reducing router latency: express virtual channels [11] and the hybrid circuit switching interconnect [8] reduce router latency by allowing part of the traffic to bypass the router pipeline. Schemes that reduce hop count can be combined with schemes that reduce router latency to further improve communication latency. Reducing router latency is achieved in [8], [11] by having preconfigured, complete or partial, circuits between source

Footnote 1: Figures 1 and 2 were produced using the simulator described in Section VII with SNUCA L2 and a 1MB L2 bank size.
doi:10.1109/hoti.2009.15 dblp:conf/hoti/AbousamraMJ09 fatcat:yx6v5t3qw5grjfxladscd3c7ya