Signal processing on platforms with multiple cores: Part 1 - Overview and methodologies [From the Guest Editors]

Yen-Kuang Chen, Chaitali Chakrabarti, Shuvra Bhattacharyya, Bruno Bougard
2009 IEEE Signal Processing Magazine  
Multicore processors are now prevalent in all major domains of signal processing. Many laptop and desktop computers today are shipped with dual-core and even quad-core processors. The number of cores is even higher for the Sony PlayStation 3, which is equipped with an eight-core IBM CELL Broadband Engine processor, the Nvidia GeForce 9800 GX2, which has 256 stream processors, and the Sun UltraSPARC T1/T2 processor, which has eight cores. Technology predictions indicate that this trend will continue and that the number of cores per processor can easily double around every two or three years.
The reason why multicore architectures are the vendors' choice today can be traced back to trends in silicon-processing technology. For several decades, technology scaling provided cheaper, faster, and more energy-efficient transistors. For instance, in embedded systems, this provided an easy mechanism for achieving more computing performance and lower power consumption simultaneously. However, the "power wall" was hit at the 90-nm node. Since then, it has not been possible to increase performance at comparable power consumption levels by technology scaling alone. In high-performance computing, the trend was to increase performance through higher clock speed at the cost of power consumption. For example, from the mid-1980s to the late 1990s, the power consumption of Intel's microprocessors doubled every two to three years and reached 20 W per square centimeter. Packaging solutions turned out to be more expensive than the integrated circuits themselves. It became imperative to pursue a different direction to increase performance.
Interestingly, for a given processor architecture in a given technology, the power consumption decreases faster than the performance when the clock rate is reduced. Typically, 20% under-clocking (with lower supply voltage) yields 50% power reduction and "only" 13% performance loss. For the same power consumption, a dual-core solution clocked at 20% less would bring, in theory, 73% more performance than a single core. This trend has led to a new approach in exploiting technology scaling, where the area cost reduction obtained from scaling is used to increase the number of cores. While the challenges of designing multicore systems in hardware are many, writing efficient parallel applications that utilize the computing capability of many processing cores may require even more effort.
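The under-clocking figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the common first-order model in which dynamic power scales as V^2 * f with the supply voltage lowered in proportion to the clock, and it assumes that two cores deliver exactly twice the per-core throughput. Neither assumption is stated in the text, and the function names are hypothetical. Under these assumptions, the result is 1.74 times the single-core performance at about 1.02 times the power, close to the 73% gain quoted; the small gap presumably comes from rounding in the quoted percentages.

```python
def dynamic_power_ratio(freq_scale, voltage_scale=None):
    """Relative dynamic power under the assumed model P ~ V^2 * f.

    If no voltage scale is given, the voltage is assumed to scale
    in proportion to the clock frequency.
    """
    if voltage_scale is None:
        voltage_scale = freq_scale
    return voltage_scale ** 2 * freq_scale


def dual_core_estimate(freq_scale=0.8, perf_per_core=0.87, cores=2):
    """Power and performance of `cores` under-clocked cores,
    both relative to one core at full clock.

    perf_per_core=0.87 encodes the 13% performance loss quoted in the text;
    perfect parallel scaling across cores is an idealizing assumption.
    """
    power = cores * dynamic_power_ratio(freq_scale)
    perf = cores * perf_per_core
    return power, perf


power, perf = dual_core_estimate()
print(f"relative power: {power:.3f}")
print(f"relative performance: {perf:.2f}")
```

A single core at 80% clock comes out at roughly 0.51 of the original power under this model, which matches the "50% power reduction" in the text and is why two such cores fit in about the same power budget as one full-speed core.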
To deliver the best performance, existing serial algorithms need to be redesigned to take advantage of the multicore computing power. This is because the best sequential algorithm is not necessarily the best parallel algorithm. Signal processing algorithm designers of the future will need to better understand the nuances of multicore computing engines. Only then can the tremendous computing power that such platforms provide be harnessed to their full potential.
To give a thorough view of the area, we offer two special issues on this topic. This first special issue is aimed at providing coverage of key trends and emerging directions in architectures, design methods, software tools, and application development for design and implementation of multicore signal-processing systems. A follow-up of this issue will describe novel applications that can be enabled by platforms with multiple cores, and more extensive design examples of signal processing on platforms with multiple cores that demonstrate useful techniques for developing efficient implementations.
There are a total of 11 articles in this issue. These span three thrust areas: architectures (articles 1-3), software tools and methodologies (articles 4-7), and design examples (articles 8-11). Together, these articles provide the breadth needed for a casual reader and the depth needed for a digital signal processing practitioner.
The first architecture article by Blake et al. is on general-purpose multicore architectures that can be used from laptops and desktops to servers. It describes the key attributes, which include power/performance, processing elements, memory systems, and application domains, that are common to all multicore processor implementations, and then illustrates these attributes with current and future multicore designs. Karam et al. present an overview of existing multicore DSP architectures.
doi:10.1109/msp.2009.934556 fatcat:ept7w36whzetvmi7mpp7iaue7q