Real-Time Fault-Tolerance in Federated Cloud Environments

Peter Garraghan, Paul Townend, Jie Xu
2012 IEEE 15th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing Workshops
Universities of Leeds, Sheffield and York, http://eprints.whiterose.ac.uk/. This is the published version of a proceedings paper presented at the 2012 15th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing Workshops. Garraghan, P., Townend, P. and Xu, J. (2012) Real-time fault-tolerance in federated cloud environments.

Abstract: Dependability is a critical concern in provisioning services in Cloud Computing environments. This is true when considering ... ability, an attribute of dependability that is a critical and challenging problem in a Cloud context [2]. Fault-tolerance is one means to attain reliability, and is typically implemented by using some form of diversity. Federated Cloud, an emerging Cloud paradigm that orchestrates multiple Clouds, can by its inherent design implement environmental diversity for Cloud applications with relative ease and at minimal additional cost to the consumer. Real-Time Applications (RTAs) can benefit from deploying fault-tolerant schemes, which help them fulfill deadlines in the presence of faults by enabling the provisioning of correct service when a component of the application fails. However, this diversity can potentially become an issue when designing dynamically scalable fault-tolerant RTAs in a federated Cloud environment while also fulfilling QoS demands. In particular, building fault-tolerant RTAs that use the diversity of the Virtual Machine (VM) configurations and of the underlying Cloud infrastructure can have a negative impact on the ability to fulfill deadlines whilst still allowing the application to dynamically provision VMs with minimal human interaction. This paper identifies a number of characteristics that affect the ability of an RTA to fulfill specified deadlines in a federated Cloud environment as a result of deploying environment-diverse fault-tolerant schemes. Furthermore, we have designed and performed initial experiments using a real-world Cloud federation to show that this problem arises in practice. Results demonstrate that deploying RTAs in a federated Cloud environment can potentially increase the rate of deadline violations.
doi:10.1109/isorcw.2012.30 dblp:conf/isorc/GarraghanTX12 fatcat:6wqqqpf5njhevi5uv2sioxmccy