Guest Editorial: Emerging Technologies and Architectures for Manycore Computing Part 1: Hardware Techniques

Sebastien Le Beux, Paul V. Gratz, Ian O'Connor
2018 IEEE Transactions on Multi-Scale Computing Systems  
The pursuit of Moore's Law is slowing, and the exploration of alternative devices is underway to replace the CMOS transistor and traditional architectures at the heart of data processing. Moreover, the emergence of stringent application constraints, particularly those linked to energy consumption, requires new system architectural strategies (e.g., manycore) and real-time operational adaptability approaches. Such complex systems require new and powerful design and programming methods to ensure optimal and reliable operation. Thus, this special issue aims at collating new research along all the dimensions of emerging technologies and architectures for computing in manycores.
Interest in this special issue was quite high, reflecting the broad interest that this topic holds within the community. Further, the range of subjects was also quite broad. To accommodate this high and broad interest, we have elected to split our special issue into two issues. The first issue focuses on hardware-oriented techniques. The next issue will focus on software-related techniques.
[...], examines a critical constraint on energy efficiency in data-parallel, clustered many-core IoT devices: the instruction caching hierarchy. This article examines mechanisms to share the instruction cache across multiple cores in clustered many-cores. Ultimately, the article shows that, for signal processing applications, the multi-port cache architecture can improve performance significantly with respect to traditional private cache approaches.
With the scaling of many-core systems, a key component of their design will be the interconnect between the cores. The correct design of this network-on-chip (NoC) interconnect requires successful and detailed pre-RTL design space exploration, a key component of which is detailed and accurate, yet low-overhead, simulation of realistic workloads. In the article [...] Illikkal present a systematic traffic modeling and generation methodology for efficient evaluation of NoC-based many-core systems. They then show that this traffic suite provides accurate and detailed traffic data with much less overhead than prior techniques.
The next article, "High-Precision Performance Estimation for the Design Space Exploration of Dynamic Dataflow Programs," authored by Małgorzata Michalska, Simone Casale-Brunet, Endri Bezati, and Marco Mattavelli, explores the problem of mapping dynamic dataflow programs onto many/multi-core architectures. In particular, partitioning and scheduling the processing elements of dynamic dataflow programs, together with their buffers and storage, across a set of many-core processors is a challenge that requires accurate performance estimation. To this end, the article develops a set of performance estimation tools and heuristics to aid in the partitioning and scheduling of these programs across many-core machines.
Another important challenge in scaling workloads across many nodes in future many-core architectures lies in the synchronization of data movement between the nodes. In particular, synchronization via locking mechanisms often incurs significant latency, greatly lowering the scaling potential of multi-threaded applications. Prior work has shown that speculatively eliding locks can greatly improve application scalability, but at the cost of performance lost to rollback when the speculation is incorrect. In "On Approximate Speculative Lock Elision," authors S. Karen Khatamifard, Ismail Akturk, and Ulya R. Karpuzcu show that locks can be elided speculatively without the need for rollback if the errors induced by the lock elision failure are bounded and the result is approximated.
[...] examine the problem of interconnecting multiple such FPGAs with processor cores via the on-chip network. In particular, they propose distributed packet receivers and hierarchical packet senders to maintain scalability and reduce the critical path delay under a heavy task load. They further propose a technique to chain accelerators together via the on-die interconnect.
With increased scaling of many-core processors, heterogeneity has become a key tool to achieve energy efficiency in application execution, as evidenced by recent big.LITTLE designs from ARM and other manufacturers. "CHOAMP: Cost Based Hardware Optimization for Asymmetric Multicore Processors," by authors Jyothi Krishna Viswakaran Sreelatha, Shankar Balachandran, and Rupesh Nasre, examines the problem of optimizing the core-thread mapping in asymmetric many-core designs for best performance and energy efficiency. In particular, they develop a probabilistic method which models program construct behavior under different run constraints, and uses this model to choose the cost function of choice to determine this mapping at compile time.
[...] further illustrates the importance of the interconnect with the continuing scaling of many-core processors. In this work, the authors develop a novel adaptive routing algorithm inspired by the A-star search algorithm. This approach allows the simultaneous co-optimization of both latency and throughput in their adaptive routing technique, even in the presence of faults.
Research along all of the dimensions of emerging technologies and architectures for computing in manycores is of great and growing importance as semiconductors scale through the final generations of CMOS process technology. This topic area is exceptionally active and vast. Here, we present a sampling of techniques in this area with a focus on hardware-oriented techniques, which we hope will foster new and innovative approaches to research in the community. We further hope the reader will return for the second half of our special issue, focusing on software techniques, to be released in the next issue of the IEEE Transactions on Multi-Scale Computing Systems.
Sebastien Le Beux, Paul V. Gratz, and Ian O'Connor
Guest Editors
doi:10.1109/tmscs.2018.2826758 fatcat:3w6uto7qovfaxpfozlvrsq5oxe