Software-controlled fault tolerance

George A. Reis, Jonathan Chang, Neil Vachharajani, Ram Rangan, David I. August, Shubhendu S. Mukherjee
2005 ACM Transactions on Architecture and Code Optimization (TACO)  
Traditional fault tolerance techniques typically utilize resources ineffectively because they cannot adapt to the changing reliability and performance demands of a system. This paper proposes software-controlled fault tolerance, a concept allowing designers and users to tailor their performance and reliability for each situation. Several software-controllable fault detection techniques are then presented: SWIFT, a software-only technique, and CRAFT, a suite of hybrid hardware/software techniques. Finally, the paper introduces PROFiT, a technique which adjusts the level of protection and performance at fine granularities through software control. When coupled with software-controllable techniques like SWIFT and CRAFT, PROFiT offers attractive and novel reliability options.
doi:10.1145/1113841.1113843 fatcat:ijdgvyk3izhv5ijvohyh74jluq