The effect of LUT and cluster size on deep-submicron FPGA performance and density

Elias Ahmed, Jonathan Rose
2000 Proceedings of the 2000 ACM/SIGDA eighth international symposium on Field programmable gate arrays - FPGA '00  
In this paper we revisit the FPGA architectural issue of the effect of logic block functionality on FPGA performance and density. In particular, in the context of lookup table, cluster-based island-style FPGAs [4], we look at the effect of lookup table (LUT) size and cluster size (number of LUTs per cluster) on the speed and logic density of an FPGA. Although this question was addressed some time ago in [17], [18], [12], [13], [10], and [22], several reasons compelled us to revisit the issue. First, prior work focused on non-clustered logic blocks, which are known to have a significant impact on the area and delay [16]. Second, most prior studies tended to look at area or delay, but not both as we do here. Third, prior results were based on IC process generations that are several factors larger than current process generations, and so do not take deep-submicron electrical effects into account. In the present work, we perform detailed SPICE-level simulations of circuits and perform appropriate buffer and transistor sizing for all the logic and routing elements, in the manner of [4]. Fourth, the CAD tools available today for experimentation are significantly better than those available 10 years ago, when this question was first raised. Our new results show that the superior tools give rise to different trends in the explanation of the results. Finally, a recent publication [11] has suggested that a more fine-grained logic block (smaller LUT size) is a better choice than was previously thought. We use a fully timing-driven experimental flow [4] [15] in which a set of benchmark circuits is synthesized into different cluster-based [2] [3] [15] logic block architectures, which contain groups of LUTs and flip-flops. We look across all architectures with LUT sizes in the range of 2 to 7 inputs and cluster sizes from 1 to 10 LUTs. To judge the quality of each architecture, we both perform detailed circuit-level design and measure the demand for routing resources for every circuit in each architecture. These experiments have resulted in several key contributions. First, we have experimentally determined the number of inputs required for a cluster as a function of the LUT size (K) and cluster size (N). Second, contrary to previous results,
doi:10.1145/329166.329171 dblp:conf/fpga/AhmedR00 fatcat:vvvzkqslgfb7do5prbs25jrs3q