Post-silicon bug diagnosis with inconsistent executions

Andrew DeOrio, Daya Shanker Khudia, Valeria Bertacco
2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)
The complexity of modern chips intensifies verification challenges, and an increasing share of this verification effort is shouldered by post-silicon validation. Focusing on the first silicon prototypes, post-silicon validation poses critical new challenges such as intermittent failures, where multiple executions of the same test do not yield a consistent outcome. These failures are often due to on-chip asynchronous events and electrical effects, leading to extremely time-consuming, if not unachievable, bug diagnosis and debugging processes. In this work, we propose a methodology called BPS (Bug Positioning System) to support the automatic diagnosis of these difficult bugs. During post-silicon validation, lightweight BPS hardware logs a compact encoding of observed signal activity over multiple executions of the same test: some passing, some failing. Leveraging a novel post-analysis algorithm, BPS uses the logged activity to diagnose the bug, identifying the approximate manifestation time and critical design signals. We found experimentally that BPS can localize most bugs down to the exact root signal and within about 1,000 clock cycles of their occurrence.
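The abstract does not spell out the post-analysis algorithm, so the following is only a minimal sketch of the general idea of comparing logged activity from passing and failing runs. It assumes a hypothetical encoding in which each run records one number per signal per time window (for example, a toggle count), and it flags a signal at the first window where the band of failing values (mean plus or minus k standard deviations) lies entirely outside the band of passing values. The function name `diagnose`, the threshold `k`, and the sample data are all illustrative assumptions, not taken from the paper.

```python
from statistics import mean, pstdev


def band(values, k):
    """Return (low, high) = mean -/+ k population standard deviations."""
    m, s = mean(values), pstdev(values)
    return m - k * s, m + k * s


def diagnose(passing, failing, k=1.0):
    """Return (window, signal) pairs sorted by earliest divergence.

    passing, failing: lists of runs; each run maps a signal name to a list
    of per-window signature values (hypothetical encoding).
    """
    suspects = []
    for sig in passing[0]:
        for w in range(len(passing[0][sig])):
            p_lo, p_hi = band([run[sig][w] for run in passing], k)
            f_lo, f_hi = band([run[sig][w] for run in failing], k)
            # Flag only when the two bands do not overlap at all.
            if f_lo > p_hi or f_hi < p_lo:
                suspects.append((w, sig))
                break  # keep only the first divergent window per signal
    return sorted(suspects)


# Made-up signatures: signal "b" diverges from window 2 onward in failing runs.
passing = [
    {"a": [3, 4, 3, 3], "b": [5, 5, 5, 5]},
    {"a": [4, 3, 3, 4], "b": [5, 6, 5, 5]},
    {"a": [3, 3, 4, 3], "b": [6, 5, 5, 6]},
]
failing = [
    {"a": [3, 4, 3, 3], "b": [5, 5, 12, 13]},
    {"a": [4, 3, 4, 3], "b": [6, 5, 11, 12]},
]
suspects = diagnose(passing, failing)
```

In this sketch the first entry of `suspects` names the earliest window and signal at which failing runs separate from passing ones; multiplying the window index by the window length in cycles would give an approximate manifestation time.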
doi:10.1109/iccad.2011.6105414 dblp:conf/iccad/DeOrioKB11 fatcat:bbzbb3jjkngfllw47dev6dvof4