A Hybrid Scheduling Algorithm to Achieve Fault Tolerance in Grids
International Journal of Engineering and Advanced Technology
A grid is a computational infrastructure that securely integrates a large number of computing resources to handle geographically dispersed workloads. The performance of a grid is usually measured in terms of its complex workflows, criticality, and fault-tolerance properties. Fault tolerance is conventionally achieved by periodically checkpointing and replicating tasks. The major drawback of these techniques is the run-time overhead they introduce. To overcome this drawback, this paper proposes an algorithm that applies checkpointing and replication dynamically, provides high job throughput in the presence of failures, and improves the performance of heterogeneous grids. Simulation studies are carried out to evaluate the proposed algorithm. The results show that the combined dynamic approach improves fault tolerance in the simulated grid environment. It is also inferred that system performance depends on the workload and the failure frequency.
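To make the idea of applying checkpointing and replication dynamically more concrete, the following is a minimal sketch and not the algorithm proposed in the paper, whose details are not given here. It assumes that the scheduler can observe a failure count over an elapsed time window, that the checkpoint interval is set with Young's first-order approximation sqrt(2 * C * MTBF), and that a second replica is started only when the estimated mean time between failures falls below a threshold and an idle resource is available. The function name, parameter names, and the 600-second threshold are all hypothetical.

```python
import math


def plan_fault_tolerance(failures_observed, elapsed_seconds,
                         checkpoint_cost_seconds, idle_resources,
                         replicate_below_mtbf=600.0):
    """Pick a checkpoint interval and replica count from observed failures.

    Illustrative only: the threshold and the decision rule are assumptions,
    not values taken from the paper.
    """
    # Estimate the mean time between failures (MTBF) from the observation window.
    if failures_observed == 0:
        mtbf = math.inf
    else:
        mtbf = elapsed_seconds / failures_observed

    # Young's approximation: interval = sqrt(2 * checkpoint cost * MTBF).
    # With no observed failures the interval is infinite (no checkpoints),
    # which avoids paying overhead when the grid is stable.
    if math.isinf(mtbf):
        interval = math.inf
    else:
        interval = math.sqrt(2.0 * checkpoint_cost_seconds * mtbf)

    # Replicate only when failures are frequent and a spare resource exists.
    replicas = 1
    if mtbf < replicate_below_mtbf and idle_resources > 0:
        replicas = 2

    return {"mtbf": mtbf, "checkpoint_interval": interval, "replicas": replicas}


if __name__ == "__main__":
    # 18 failures in one hour, 10 s per checkpoint, 4 idle resources.
    print(plan_fault_tolerance(18, 3600.0, 10.0, 4))
```

Under these assumptions the overhead adapts to the failure frequency: a stable grid pays for neither checkpoints nor replicas, while a failure-prone grid checkpoints more often and replicates when spare capacity allows.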