On Classification Heuristics of Probabilistic System-Level Fault Diagnostic Algorithms [chapter]

Tamás Bartha, Endre Selényi
2000 Distributed and Parallel Systems  
System-level fault diagnosis of massively parallel computers requires efficient algorithms that can handle many processing elements in a heterogeneous environment. Probabilistic fault diagnosis is an approach that makes the diagnostic problem both easier to solve and more generally applicable. The price of these advantages is that the diagnostic result is no longer guaranteed to be correct and complete in every fault situation. In an earlier paper [2], the authors presented a novel methodology, called local information diagnosis, and applied it to create a family of probabilistic diagnostic algorithms. This paper examines the identification of fault-free and faulty units in detail by defining three heuristic methods of fault classification and comparing the diagnostic accuracy provided by these heuristics using measurement results.
doi:10.1007/978-1-4615-4489-0_10 fatcat:fl5s27j7lfa3tkwwfybllb5srm