Optimizing Instruction-set Extensible Processors under Data Bandwidth Constraints
2007 Design, Automation & Test in Europe Conference & Exhibition
We present a methodology for generating optimized architectures for data bandwidth constrained extensible processors. We describe a scalable Integer Linear Programming (ILP) formulation that extracts the most profitable set of instruction-set extensions given the available data bandwidth and transfer latency. Unlike previous approaches, we differentiate between the number of inputs and outputs of an instruction-set extension and the number of register file ports. This differentiation makes our approach applicable to architectures that include architecturally visible state registers and dedicated data transfer channels. We support a comprehensive design space exploration to characterize the area/performance trade-offs for various applications. We evaluate our approach using actual ASIC implementations to demonstrate that our automatically customized processors meet timing within the target silicon area. For an embedded processor with only two register read ports and one register write port, we obtain up to 4.3× speed-up with extensions incurring only a 35% area overhead.
2. We integrate our technique into an optimizing compiler that generates custom ASIC processor implementations from C code.
3. We consider silicon cell area as a primary constraint, and we explore the impact of different area constraints on the number of execution cycles and cycle time.
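The distinction the abstract draws between an extension's operand count and the number of register file ports can be illustrated with a small sketch. This is not the paper's ILP formulation: it replaces the ILP with a brute-force search over a hypothetical four-node dataflow graph, and it assumes a simple cost model in which each register file access beyond the first batch of reads or writes costs one extra transfer cycle and every extension executes in one hardware cycle. The graph, latencies, and all function names are illustrative assumptions.

```python
from itertools import combinations
from math import ceil

# Hypothetical toy dataflow graph: operation -> (software cycles, operands).
# Operands that are not keys ("in0".."in4") are values read from registers.
GRAPH = {
    "a": (1, ["in0", "in1"]),
    "b": (1, ["in2", "in3"]),
    "c": (2, ["a", "b"]),
    "d": (1, ["c", "in4"]),
}
LIVE_OUT = {"d"}  # results still needed after this block


def inputs(sub):
    """Values produced outside the subgraph and consumed inside it."""
    return {p for n in sub for p in GRAPH[n][1] if p not in sub}


def outputs(sub):
    """Nodes of the subgraph whose value is used outside it."""
    used_outside = {p for n in GRAPH if n not in sub for p in GRAPH[n][1]}
    return {n for n in sub if n in used_outside or n in LIVE_OUT}


def reaches(src, targets):
    """True if some node in targets depends, transitively, on src."""
    stack, seen = [n for n in GRAPH if src in GRAPH[n][1]], set()
    while stack:
        n = stack.pop()
        if n in targets:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(m for m in GRAPH if n in GRAPH[m][1])
    return False


def is_convex(sub):
    """No path may leave the subgraph and re-enter it."""
    outside = [v for v in GRAPH if v not in sub]
    return not any(
        reaches(v, sub) and any(reaches(u, {v}) for u in sub) for v in outside
    )


def transfer_cycles(n_in, n_out, read_ports, write_ports):
    """Assumed model: operands beyond the port count cost extra cycles."""
    extra_reads = max(ceil(n_in / read_ports) - 1, 0)
    extra_writes = max(ceil(n_out / write_ports) - 1, 0)
    return extra_reads + extra_writes


def saved_cycles(sub, read_ports, write_ports, hw_cycles=1):
    """Software cycles removed minus hardware and transfer cycles added."""
    sw = sum(GRAPH[n][0] for n in sub)
    hw = hw_cycles + transfer_cycles(
        len(inputs(sub)), len(outputs(sub)), read_ports, write_ports
    )
    return sw - hw


def best_extension(read_ports, write_ports):
    """Exhaustive stand-in for the ILP: the most profitable convex subgraph."""
    candidates = (
        frozenset(c)
        for k in range(1, len(GRAPH) + 1)
        for c in combinations(GRAPH, k)
    )
    return max(
        (s for s in candidates if is_convex(s)),
        key=lambda s: saved_cycles(s, read_ports, write_ports),
    )
```

Under these assumptions, the full graph has five inputs and one output. With two read ports and one write port it pays two extra transfer cycles, so `saved_cycles(frozenset("abcd"), 2, 1)` is 2; with five read ports the transfer penalty disappears and the saving rises to 4. The extension is still a legal candidate in both cases, which is the point of separating operand counts from port counts.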