On the trade-off between power and flexibility of FPGA clock networks
ACM Transactions on Reconfigurable Technology and Systems
________________________________________________________________________

FPGA clock networks consume a significant amount of power since they toggle every clock cycle and must be flexible enough to implement the clocks for a wide range of different applications. The efficiency of FPGA clock networks can be improved by reducing this flexibility; however, reducing the flexibility introduces stricter constraints during the clustering and placement stages of the FPGA CAD flow. These constraints can reduce the overall efficiency of the final implementation. This paper examines the tradeoff between the power consumption and flexibility of FPGA clock networks. Specifically, this paper makes three contributions. First, it presents a new parameterized clock network framework for describing and comparing FPGA clock networks. Second, it describes new clock-aware placement techniques that are needed to find a legal placement that satisfies the constraints imposed by the clock network. Finally, it performs an empirical study to examine the tradeoff between the power consumption of the clock network and the impact of the CAD constraints for a number of different clock networks with varying amounts of flexibility. The results show that the techniques used to produce a legal placement can have a significant influence on power and on the ability of the placer to find a legal solution. On average, circuits placed using the most effective techniques dissipated 5% less overall energy and were significantly more likely to be legal than circuits placed using other techniques. Moreover, the results show that the architecture of the clock network is also important. On average, FPGAs with an efficient clock network were up to 14.6% more energy efficient than other FPGAs.

________________________________________________________________________

INTRODUCTION

With advancements in process technology, programmable architecture, and computer-aided design (CAD), field-programmable gate arrays (FPGAs) are now being used to implement and prototype large system-level applications. These applications often have several clock domains. In order to support applications with multiple clock domains, FPGA vendors incorporate complex clock distribution circuitry within their devices. Designing a suitable clock distribution network for an FPGA is significantly more challenging than designing such a network for a fixed-function chip such as an Application-Specific Integrated Circuit (ASIC).
In an ASIC, the locations and skew requirements of each domain are known when the clock network is designed. In an FPGA, however, a single clock network that works well across many applications must be created. When the FPGA is designed, the number of clock domains the user will require, the clock signals that will be generated, the skew requirements of each domain, and where each domain will be located on the chip are all unknown. This forces FPGA vendors to create very complex yet flexible clock distribution circuitry. This flexibility has a significant area and power overhead. Power is a particular concern, since the clock signals toggle every clock cycle and are connected to a large number of flip-flops. Previous studies have indicated that in a typical FPGA, 19% of the total FPGA power is dissipated in the clock network [Tuan 2006]. The more flexible the clock network, the more parasitic capacitance on the clock nets, and the more routing switches traversed by each clock signal; this leads to increased power dissipation. Clearly, FPGA vendors must carefully balance the flexibility of their clock distribution networks against the power dissipated by these networks.
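
The link between parasitic capacitance and clock power noted above can be illustrated with the standard first-order CMOS switching-power model, P = a * C * Vdd^2 * f. The sketch below is not taken from this paper: the function name `dynamic_power`, the activity convention (a = 1 for a net that charges and discharges once per clock period, as a clock net does), and all numeric values are assumptions chosen only for illustration.

```python
def dynamic_power(capacitance_f, vdd_v, freq_hz, activity=1.0):
    """First-order switching power in watts: activity * C * Vdd^2 * f.

    activity is the number of full charge/discharge cycles per clock
    period; a clock net completes one per period, so activity = 1.
    """
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical values, for illustration only (not measured data).
VDD = 1.2            # supply voltage in volts
FREQ = 200e6         # clock frequency in hertz
C_RIGID = 100e-12    # assumed clock-net capacitance, less flexible network
C_FLEXIBLE = 150e-12 # assumed capacitance with extra switches and wiring

p_rigid = dynamic_power(C_RIGID, VDD, FREQ)
p_flexible = dynamic_power(C_FLEXIBLE, VDD, FREQ)

# With voltage and frequency fixed, power scales linearly with capacitance.
ratio = p_flexible / p_rigid
print(f"rigid: {p_rigid:.4f} W, flexible: {p_flexible:.4f} W, ratio: {ratio:.2f}")
```

Under this model, any capacitance added to a clock net by extra routing switches translates directly into a proportional increase in clock power, since the clock's activity is fixed at its maximum.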