The resilience wall: Cross-layer solution strategies

Subhasish Mitra, Pradip Bose, Eric Cheng, Chen-Yong Cher, Hyungmin Cho, Rajiv Joshi, Young Moon Kim, Charles R. Lefurgy, Yanjing Li, Kenneth P. Rodbell, Kevin Skadron, James Stathis (+1 other)
2014 Proceedings of Technical Program - 2014 International Symposium on VLSI Technology, Systems and Application (VLSI-TSA)  
Resilience to hardware failures is a key challenge for a large class of future computing systems that are constrained by the so-called power wall: from embedded systems to supercomputers. Today's mainstream computing systems typically assume that transistors and interconnects operate correctly during useful system lifetime. With enormous complexity and significantly increased vulnerability to failures compared to the past, future system designs cannot rely on such assumptions. At the same time, there is explosive growth in our dependency on such systems. To overcome this outstanding challenge, this paper advocates and examines a cross-layer resilience approach. Two major components of this approach are: 1. system- and software-level effects of circuit-level faults are considered from early stages of system design; and 2. resilience techniques are implemented across multiple layers of the system stack, from circuit and architecture levels to runtime and applications, such that they work together to achieve required degrees of resilience in a highly energy-efficient manner. Illustrative examples to demonstrate key aspects of cross-layer resilience are discussed.
Figure: SEMU rate to SEU rate ratio for silicon-on-insulator (SOI) CMOS.
doi:10.1109/vlsi-tsa.2014.6839639 fatcat:gb7lqfyrhbel3nyda6jaqknbqq