Heterogeneous coarse-grained processing elements: A template architecture for embedded processing acceleration

G. Ansaloni, P. Bonzini, L. Pozzi
2009 Design, Automation & Test in Europe Conference & Exhibition
Reconfigurable architectures are good candidates for application accelerators that cannot be set in stone at production time. FPGAs, however, often suffer from the area and performance penalty intrinsic in gate-level reconfigurability. To reduce this overhead, coarse-grained reconfigurable arrays (CGRAs) are reconfigurable at the ALU level, but a successful design needs more than computational power, the main bottleneck usually being memory transfers. Just like the integration of hardwired multiplier and memory blocks enabled FPGAs to efficiently implement digital signal processing applications, in this paper we study a customizable architecture template based on heterogeneous processing elements (multipliers, ALU clusters and memories) that provides enough flexibility to realize fast pipelined implementations of various loop kernels on a CGRA.
doi:10.1109/date.2009.5090723 dblp:conf/date/AnsaloniBP09 fatcat:twqcaz7vyzgqtnemo2ch2rqmqm