Optimized on-chip pipelining of memory-intensive computations on the Cell BE

Christoph W. Kessler, Jörg Keller
2009 SIGARCH Computer Architecture News  
Multiprocessors-on-chip, such as the Cell BE processor, regularly suffer from restricted bandwidth to off-chip main memory. We propose to reduce memory bandwidth requirements, and thus increase performance, by expressing our application as a task graph, by running dependent tasks concurrently, and by pipelining results directly from task to task where possible instead of buffering them in off-chip memory. To maximize bandwidth savings and balance load simultaneously, we solve a mapping problem of tasks to SPEs on the Cell BE. We present three approaches: an integer linear programming formulation that can compute Pareto-optimal mappings for smaller task graphs, general heuristics, and a problem-specific approximation algorithm. We validate the mappings for data-parallel computations and sorting.
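To make the mapping problem concrete, here is a minimal sketch, not the authors' formulation: it enumerates every assignment of a tiny task graph to SPEs and keeps the Pareto-optimal ones. The two objectives (maximum per-SPE load, and the number of task-graph edges that cross SPEs) are assumed stand-ins for the load-balance and bandwidth criteria in the abstract, and the names `evaluate`, `pareto_mappings`, `load`, and `edges` are hypothetical. Exhaustive search is used in place of the paper's integer linear program and is only feasible for very small graphs.

```python
from itertools import product


def evaluate(mapping, load, edges, num_spes):
    """Return (max SPE load, number of edges whose endpoints sit on different SPEs)."""
    spe_load = [0.0] * num_spes
    for task, spe in mapping.items():
        spe_load[spe] += load[task]
    crossing = sum(1 for u, v in edges if mapping[u] != mapping[v])
    return max(spe_load), crossing


def pareto_mappings(load, edges, num_spes):
    """Brute-force all task-to-SPE mappings and keep the non-dominated ones."""
    tasks = sorted(load)
    scored = []
    for assignment in product(range(num_spes), repeat=len(tasks)):
        mapping = dict(zip(tasks, assignment))
        scored.append((evaluate(mapping, load, edges, num_spes), mapping))
    front = []
    for score, mapping in scored:
        # A mapping is dominated if another is no worse in both objectives
        # and differs in at least one of them.
        dominated = any(
            other[0] <= score[0] and other[1] <= score[1] and other != score
            for other, _ in scored
        )
        if not dominated:
            front.append((score, mapping))
    return front


# Assumed toy input: a three-task chain a -> b -> c with made-up load weights.
load = {"a": 2, "b": 1, "c": 1}
edges = [("a", "b"), ("b", "c")]
front = pareto_mappings(load, edges, num_spes=2)
front_scores = {score for score, _ in front}
```

For this toy input the front holds two trade-off points: placing all tasks on one SPE (no crossing edges, unbalanced load) and placing `a` alone with `b` and `c` together (balanced load, one crossing edge).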
doi:10.1145/1556444.1556450 fatcat:c2ajltxh3bcrxj26yvhno7e75i