Adaptive fault-tolerant model for improving cloud computing performance using artificial neural network

Awatif Ragmani, Amina Elomri, Noreddine Abghour, Khalid Moussaid, Mohammed Rida, Elarbi Badidi
2020 Procedia Computer Science  
Abstract
Cloud computing is a paradigm that has greatly contributed to the emergence of new applications of IT services. However, designing dependable and efficient cloud architectures requires detailed knowledge of the recurring causes of failure that compromise the efficient functioning of the system. Datasets recording the real state of cloud systems are still rarely available to the public. This article uses failure data published by Backblaze, one of the largest data storage providers. The dataset records the operating status of heterogeneous servers, collected between January 2015 and December 2018. The data were filtered and preprocessed to keep a total of 2,878,440 records, including 128,820 failures. We investigate the correlation between hard drive parameters and failure by exploiting the attributes of Self-Monitoring, Analysis and Reporting Technology (SMART). We then investigate the predictive capabilities of five machine learning models, including naïve Bayes and artificial neural networks (ANN), to define a failure prediction module for the cloud architecture. The experimental results demonstrate that the ANN model offers the best prediction accuracy.
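
As a rough illustration of the figures and the correlation step mentioned in the abstract, the sketch below computes the share of failure records in the filtered dataset (about 4.5% of the 2,878,440 records) and a Pearson correlation between one SMART attribute and the failure label. It is a minimal sketch and not the authors' method: the abstract does not say which correlation measure or which attributes were used, the column name `smart_5_raw` is an assumption about the dataset layout, and the eight toy rows are invented purely for illustration.

```python
from math import sqrt

# Record counts quoted in the abstract.
TOTAL_RECORDS = 2_878_440
FAILURE_RECORDS = 128_820


def failure_rate(failures: int, total: int) -> float:
    """Fraction of records labelled as failures."""
    if total <= 0:
        raise ValueError("total must be positive")
    return failures / total


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy rows, NOT taken from the dataset: one SMART attribute
# value per drive record and a 0/1 failure label for the same record.
toy_smart_5_raw = [0, 0, 2, 0, 48, 0, 112, 1]
toy_failure = [0, 0, 0, 0, 1, 0, 1, 0]

rate = failure_rate(FAILURE_RECORDS, TOTAL_RECORDS)
r = pearson(toy_smart_5_raw, toy_failure)

print(f"failure share of records: {rate:.2%}")
print(f"toy correlation (smart_5_raw vs. failure): {r:.2f}")
```

With a 0/1 label, the Pearson coefficient is the point-biserial correlation, which is one common way to screen attributes before training a classifier. The low failure share also suggests that plain accuracy should be read alongside class-sensitive metrics.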
doi:10.1016/j.procs.2020.03.106 fatcat:xjgpxnbjxjbyxgxfnvv7ugzabq