An Improved Technique for Reducing False Alarms Due to Soft Errors

S. Kundu, I. Polian
12th IEEE International On-Line Testing Symposium (IOLTS'06)  
A significant fraction of soft errors in modern microprocessors has been reported to never lead to a system failure. Any concurrent error detection scheme that raises an alarm every time a soft error is detected will not be heeded, because most of these alarms are false and responding to them degrades system performance. This paper improves the state of the art in detecting and preventing false alarms. Existing techniques are enhanced by a methodology to handle soft errors on address bits. Furthermore, we demonstrate the benefit of false alarm identification in implementing a roll-back recovery system by first calculating the optimum checkpointing interval for a roll-back recovery system and then showing that the optimal number of checkpoints decreases by orders of magnitude when exclusion techniques are used, even if the implementation of the exclusion technique is not perfect.
doi:10.1109/iolts.2006.10 dblp:conf/iolts/KunduP06 fatcat:aoonq5gfnndftfqgqvska3bu3e