Efficient Resource Management Mechanism with Fault Tolerant Model for Computational Grids

R. Kohila
2014 International Journal of Computer Applications Technology and Research  
In the existing system, primary-backup approach is used for fault tolerance in a single environment. In this approach, each task has a primary copy and backup copy on two different processors.  ...  The proposed work is to manage the resource failures in grid job scheduling. In this method, data source and resource are integrated from different geographical environment.  ...  [3] develop a fault tolerant job scheduling strategy in order to tolerate faults gracefully in an economy based grid environment.  ... 
doi:10.7753/ijcatr0312.1004 fatcat:fyc3th7gmnaktb7d3laphsjsk4

Performance improvement in Distributed Systems through Replication and Checkpointing

Sourabh Dave, Abhishek Raghuvanshi
2012 International Journal of Computer Applications  
We have given an algorithm for replication and implemented it in Java RMI.  ...  In distributed system fault tolerance is an important issue.  ...  It is used to increase the availability of data. Replication based approach [13] is one of the famous and efficient approach for improving fault tolerance of the distributed system.  ... 
doi:10.5120/5800-8039 fatcat:mapgesopuzf6ppbqvtsaiyc2c4

Multi-Agent System for Fault Tolerance in Wireless Sensor Networks

2016 KSII Transactions on Internet and Information Systems  
Our multi-agent system consists of a resource manager, a fault tolerance manager and a load balancing manager, and we also propose fault-tolerant protocols that use multi-agent and mobile agent setups.  ...  Thus, we propose a novel multi-agent fault tolerant system for wireless sensor networks.  ...  to provide fault tolerant routing protocols for the WSNs.  ... 
doi:10.3837/tiis.2016.03.021 fatcat:j5seoy25x5flhpvymvwdekgcya

Reliability in grid computing systems

Christopher Dabrowski
2009 Concurrency and Computation  
In recent years, grid technology has emerged as an important tool for solving compute-intensive problems within the scientific community and in industry.  ...  Even with standard interfaces and communications protocols in place, resource heterogeneity and dynamism will likely lead to component interactions that result in faults and failures which imperil executing  ...  Studies that have focused on fault tolerance include [103] , where a quorum-based protocol was described for managing replicated data in large-scale distributed systems, including data grids.  ... 
doi:10.1002/cpe.1410 fatcat:xih4uaq3unf7hcxa67ssxoh2jm

An efficient and scalable approach for implementing fault-tolerant DSM architectures

C. Morin, A.-M. Kermarrec, M. Banatre, A. Gefflaut
2000 IEEE transactions on computers  
The proposed solution is based on backward error recovery and consists of an extension to the existing coherence protocol to manage data used by processors for the computation and recovery data used for  ...  fault tolerance.  ...  ACKNOWLEDGMENTS We would like to thank Pete Lee for his suggestions on improving this paper.  ... 
doi:10.1109/12.859537 fatcat:4b5ka7itibai7mzqplpg56rcey

A comprehensive conceptual system-level approach to fault tolerance in Cloud Computing

Ravi Jhawar, Vincenzo Piuri, Marco Santambrogio
2012 2012 IEEE International Systems Conference SysCon 2012  
This paper introduces an innovative perspective on creating and managing fault tolerance that shields the implementation details of the reliability techniques from the users by means of a dedicated service  ...  Most existing research and implementations focus on architecture-specific solutions to introduce fault tolerance.  ...  Techniques to build efficient and fault tolerant applications for Amazon's EC2 are provided in [7] .  ... 
doi:10.1109/syscon.2012.6189503 fatcat:uhtdxwp3obd23muqwznnfesznm

Challenges facing tomorrow's datacenter

Robbert van Renesse, Rodrigo Rodrigues, Mike Spreitzer, Christopher Stewart, Doug Terry, Franco Travostino
2008 Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware - LADIS '08  
Over the course of three days, attendees laid forth an ambitious research agenda that covered hot topics, ranging from fault-tolerance algorithms to performance management to cloud computing.  ...  Thus developing new failure models that tolerate such faults yet enable the construction of more efficient protocols than BFT replication is an area of future work. • What can Byzantine failure detection  ...  To address this challenge in an environment where faults are inevitable, and even commonplace, existing services deployed on datacenters rely on replication to tolerate crash faults.  ... 
doi:10.1145/1529974.1529976 fatcat:zmlphu6l6ndovnxur5znfugh24

Adaptive byzantine fault tolerance support for agent oriented systems: The BDARX

Alvi et al., Department of Computer Science and Information Technology, The University of Lahore, Lahore, Pakistan
2019 International Journal of Advanced and Applied Sciences  
Dynamic agent replication scheme (DARX) architecture is one of the most studied fault-tolerance architectures for multi-agent systems.  ...  It deals with adaptive dynamic replication schemes to make agent systems more fault tolerant, but it does not handle Byzantine faults in MAS environments.  ...  An efficient and proved way to attain fault tolerance in such systems is the application of replication strategies.  ... 
doi:10.21833/ijaas.2019.02.009 fatcat:nyjk6mn645hqne473yoz6scw4a

On Byzantine Fault Tolerance in Multi-Master Kubernetes Clusters [article]

Gor Mack Diouf, Halima Elbiaze, Wael Jaafar
2020 arXiv   pre-print
KmMR is based on the adaptation and integration of the BFT-SMaRt fault-tolerant replication protocol into Kubernetes environment.  ...  However, it requires adequate control and management via an orchestrator.  ...  number of faults tolerated by the replication protocol in place.  ... 
arXiv:1904.06206v2 fatcat:gjd4vy4buvcgxfmzso3xqpk5lm

A Hybrid Fault Tolerance System for Distributed Environment using Check Point Mechanism and Replication

S. Veera, S. Gavaskar, A. Sumithra
2017 International Journal of Computer Applications  
The efficiency of the algorithm depends on how much replication is done and up to what extent the fault tolerance has been achieved.  ...  Managing the distributed environment against the failures plays an important role nowadays. There are so many techniques evolved so far and each have their own merit and demerit.  ...  In order to replicate an object a replication protocol is used. Primary-backup replication [27] , voting [23] , and primary-per partition protocol [24] are some of the replication protocols.  ... 
doi:10.5120/ijca2017912614 fatcat:ze5kjjc2wffm7ecyzjhidrvnfi

Vigne: Towards a Self-healing Grid Operating System [chapter]

Louis Rilling
2006 Lecture Notes in Computer Science  
The VDMS is based on fault-tolerant consistency protocols allowing to replicate shared data to improve performance [2].  ...  for persistent data.  ...  Acknowledgments The author would like to thank Emmanuel Jeanvoine for his participation in designing Vigne, and Christine Morin for her valuable advices and comments.  ... 
doi:10.1007/11823285_45 fatcat:lbuxz4nocrfbda7223qbxew2fm

A hybrid fault-tolerant routing based on Gaussian network for wireless sensor network

Dung Nguyen Quoc, Niansheng Liu, Donghui Guo
2021 Journal of Communications and Networks  
The purpose of FCGW is to improve fault tolerance, increase data reliability and reduce energy consumption for wireless sensor networks.  ...  The experimental results of the proposed scheme show that FCGW protocol has high data reliability.  ...  Accordingly, the energy consumption for network management and data transmission is huge, as well as increasing data latency, which greatly affects the efficiency of fault detection and fault recovery  ... 
doi:10.23919/jcn.2021.000028 fatcat:acmbuxmpxjarxjfzhkkxkle7ne

Data Structures and Algorithms for Packet Forwarding and Classification

Sartaj Sahni
2009 2009 10th International Symposium on Pervasive Systems, Algorithms, and Networks  
for distributed shared memory p. 132 ER-TCP : an efficient fault-tolerance scheme for TCP connections p. 139 Online adaptive fault-tolerant routing in 2D torus p. 150 Replicating multithreaded  ...  Web services p. 162 Design schemes and performance analysis of dynamic rerouting interconnection networks for tolerating faults and preventing collisions p. 168 RRBS : a fault tolerance model  ... 
doi:10.1109/i-span.2009.122 dblp:conf/ispan/Sahni09 fatcat:p44oexvkrzbobdwhmrc4lkpglu

An Efficient Replicated Data Management Approach for Peer-to-Peer Systems [chapter]

J. H. Abawajy
2005 Lecture Notes in Computer Science  
In this paper, we propose an approach to address the data replication problem on P2P systems.  ...  The proposed scheme is compared with other techniques and is shown to require less communication cost for an operation as well as provide higher degree of data availability.  ...  An efficient data replication management (DRM) technique is one of the important P2P technologies.  ... 
doi:10.1007/11428862_62 fatcat:ftlzseyqrnaelhkutiavsa4ocy

Fault Tolerance Management in Cloud Computing: A System-Level Perspective

Ravi Jhawar, Vincenzo Piuri, Marco Santambrogio
2013 IEEE Systems Journal  
In this paper, we introduce an innovative, system-level, modular perspective on creating and managing fault tolerance in Clouds.  ...  Index Terms-Cloud computing, fault tolerance as a service, fault tolerance properties, system level fault tolerance.  ...  For example, the data tier of the banking service can be replicated on several storage servers such that at least one copy of the data is always available to process customer queries.  ... 
doi:10.1109/jsyst.2012.2221934 fatcat:qa444mrpwfegjpnl3tix2kxdt4