Reliability of Heterogeneous Distributed Computing Systems in the Presence of Correlated Failures
2014
IEEE Transactions on Parallel and Distributed Systems
While the reliability of distributed-computing systems (DCSs) has been widely studied under the assumption that computing elements (CEs) fail independently, the impact of correlated failures of CEs on the reliability remains an open question. Here, the problem of modeling and assessing the impact of stochastic, correlated failures on the service reliability of applications running on DCSs is tackled. The service reliability is modeled using an integrated analytical and Monte-Carlo (MC)
doi:10.1109/tpds.2013.78
fatcat:4t22ps5oazgqbgtwm34udjrhdi