Parameter Prediction in Fault Management Framework

Thanyalak CHALERMARREWONG, Simon See, Tiranee Achalakul
2012 Proceedings of The International Symposium on Grids and Clouds (ISGC) 2012, PoS(ISGC 2012), unpublished
High performance computing systems can have high failure rates because they comprise a large number of servers and components running intensive workloads. System availability is easily compromised if failures of these subsystems are not handled correctly. Ensuring the availability of computing resources therefore requires an effective fault management framework. This research proposes a strategy for preserving system availability that focuses on a prediction model. An ARMA model is used as the framework's parameter prediction method. The main idea is to create an effective prediction model focusing on hardware failure. System parameters associated with hardware faults are the inputs to the prediction model, which uses prior data to predict future data. Each predicted parameter is then used to predict the availability of the system. Experiments show the effectiveness of this model and how to find an appropriate interval for periodically gathering data.
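
The abstract does not give the model order or the fitting procedure, so the following is only a minimal sketch of what ARMA-based parameter prediction can look like, not the authors' implementation. It assumes an ARMA(1,1) model, x_t = c + phi*x_(t-1) + e_t + theta*e_(t-1), fitted by a coarse grid search that minimizes the conditional sum of squared residuals. The function names and the sample temperature readings are hypothetical.

```python
from itertools import product


def css_residuals(series, c, phi, theta):
    """One-step residuals of an ARMA(1,1) model, taking the first residual as 0."""
    residuals = [0.0]
    for t in range(1, len(series)):
        predicted = c + phi * series[t - 1] + theta * residuals[-1]
        residuals.append(series[t] - predicted)
    return residuals


def fit_arma11(series):
    """Grid-search phi and theta in (-1, 1); c is tied to the sample mean.

    This is an illustrative estimator (an assumption of this sketch),
    chosen for brevity rather than statistical efficiency.
    """
    mean = sum(series) / len(series)
    grid = [i / 20 for i in range(-19, 20)]  # -0.95 .. 0.95 in steps of 0.05
    best = None
    for phi, theta in product(grid, grid):
        c = mean * (1.0 - phi)  # keeps the model mean equal to the sample mean
        residuals = css_residuals(series, c, phi, theta)
        sse = sum(e * e for e in residuals[1:])
        if best is None or sse < best[0]:
            best = (sse, c, phi, theta)
    return best[1], best[2], best[3]


def forecast_next(series, c, phi, theta):
    """Predict the next value from the last observation and the last residual."""
    last_residual = css_residuals(series, c, phi, theta)[-1]
    return c + phi * series[-1] + theta * last_residual


if __name__ == "__main__":
    # Hypothetical readings of one hardware parameter (e.g. a temperature in
    # degrees Celsius) gathered at a fixed interval; not data from the paper.
    readings = [61.0, 62.5, 63.1, 62.8, 64.0, 65.2, 64.7, 66.1, 67.0, 66.4]
    c, phi, theta = fit_arma11(readings)
    print("fitted c, phi, theta:", c, phi, theta)
    print("predicted next reading:", forecast_next(readings, c, phi, theta))
```

In a framework like the one described, each monitored parameter would get its own fitted model, and the predicted values would feed the availability estimate. The gathering interval determines what one "step" of the forecast means, which is presumably why the experiments examine how to choose it.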
doi:10.22323/1.153.0005 fatcat:e3eovrh5hjeafmrukyvvk6esoe