Nano-Photonic Networks-on-Chip for Future Chip Multiprocessors
More than Moore Technologies for Next Generation Computer Design
Cheng Li, Paul V. Gratz, and Samuel Palermo
systems makes traditional electrical on-chip networks prohibitive for future transformative extrascale computers. Recently, monolithic silicon photonics has been proposed as a scalable alternative to meet the bandwidth demands of future many-core systems, by leveraging high-speed photonic devices [4, 5, 6], THz-bandwidth waveguides [7, 8], and immense bandwidth density via wavelength-division multiplexing (WDM) [9, 10]. Several NoC architectures
... ging the high bandwidth of silicon photonics have been proposed. These works can be categorized into two general types: 1) hybrid optical/electrical interconnect architectures [11, 12, 13, 14], in which a photonic packet-switched network and an electronic circuit-switched control network are combined to deliver large data messages and short control messages, respectively; and 2) crossbar or Clos architectures, in which the interconnect is fully photonic [15, 16, 17, 18, 19, 20, 21, 22, 23]. Although these designs provide high and scalable bandwidth, they suffer from either relatively high latency, due to the electrical control circuits used for photonic path setup, or significant power and hardware overhead, due to heavily over-provisioned photonic channels. In future latency- and power-constrained CMPs, these characteristics will hobble the utility of photonic interconnect.
In this chapter, we propose LumiNOC, a novel PNoC architecture that addresses the power and resource overheads caused by channel over-provisioning, while reducing latency and maintaining high bandwidth in CMPs. LumiNOC utilizes integrated silicon waveguides, which have the potential to overcome electrical interconnect bottlenecks and greatly improve data transfer efficiency, owing to their flat channel loss over a wide frequency range and their relatively small crosstalk and electromagnetic noise. By combining multiple data channels on a single waveguide via wavelength-division multiplexing (WDM), LumiNOC greatly improves bandwidth density. Area-compact and energy-efficient silicon ring resonators are employed as the optical modulators and drop filters in the integrated WDM link. Silicon ring resonator modulators and filters offer the advantages of small size, relative to Mach-Zehnder modulators, and increased filter functionality, relative to electro-absorption modulators.
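The bandwidth-density benefit of WDM noted above can be illustrated with a short back-of-the-envelope calculation. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the example figures (32 wavelengths, 10 Gb/s per wavelength, a 4 µm waveguide pitch) are placeholders chosen for the arithmetic, not values taken from this chapter.

```python
def wdm_aggregate_gbps(num_wavelengths: int, rate_per_wavelength_gbps: float) -> float:
    """Aggregate data rate of one waveguide carrying several WDM channels.

    Each wavelength is an independent data channel, so the rates simply add.
    """
    if num_wavelengths < 1:
        raise ValueError("need at least one wavelength")
    if rate_per_wavelength_gbps <= 0:
        raise ValueError("per-wavelength rate must be positive")
    return num_wavelengths * rate_per_wavelength_gbps


def bandwidth_density_gbps_per_um(aggregate_gbps: float, waveguide_pitch_um: float) -> float:
    """Bandwidth per unit of chip width occupied by one waveguide."""
    if waveguide_pitch_um <= 0:
        raise ValueError("waveguide pitch must be positive")
    return aggregate_gbps / waveguide_pitch_um


if __name__ == "__main__":
    # Placeholder numbers (assumptions, not from the chapter).
    aggregate = wdm_aggregate_gbps(num_wavelengths=32, rate_per_wavelength_gbps=10.0)
    density = bandwidth_density_gbps_per_um(aggregate, waveguide_pitch_um=4.0)
    print(f"aggregate: {aggregate} Gb/s, density: {density} Gb/s per um")
```

Under these assumed numbers, a single waveguide carries 32 times the data rate of a single-wavelength link in the same footprint, which is the sense in which WDM raises bandwidth density.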
The LumiNOC architecture makes three contributions. First, instead of conventional, globally distributed photonic channels, which require high laser power, we propose a novel channel-sharing arrangement in which subsets of cores are grouped into photonic subnets. Second, we propose a novel, purely photonic, distributed arbitration mechanism, dynamic channel scheduling, which achieves extremely low latency without degrading throughput. Third, our photonic network architecture uses the same wavelengths for channel arbitration and parallel data transmission, allowing efficient utilization of the photonic resources and lowering static power consumption. We show in a 64-node implementation that LumiNOC achieves 50% lower latency at low loads and ∼40% higher throughput per Watt on synthetic traffic than previous PNoCs. Furthermore, LumiNOC reduces latency by ∼40% relative to an electrical 2D mesh NoC on PARSEC shared-memory, multithreaded benchmark workloads.