
A Flexible Fault-Tolerance Mechanism for the Integrade Grid Middleware

Stanley Araujo de Sousa, Francisco Jose da Silva e Silva, Rafael Fernandes Lopes
2007 International Conference on Networking and Services (ICNS '07)  
The dynamic nature of the grid infrastructure, its high scalability, and great heterogeneity exacerbate the likelihood of error occurrence, imposing fault tolerance as a major requirement for grid middleware  ...  This paper describes a flexible fault-tolerance mechanism implemented on Integrade grid middleware that allows the customization of several failure handling parameters and the combination of different  ...  [8] present a flexible fault tolerance framework for grid environments.  ... 
doi:10.1109/icns.2007.7 dblp:conf/icns/SousaSL07 fatcat:xqvagp2iorgdtnr5mm53eoqnny

Adaptive fault tolerance mechanisms for opportunistic environments: a mobile agent approach

V. G. Pinheiro, A. Goldman, F. Kon
2011 Concurrency and Computation  
MAG includes retrying, replication and checkpointing as fault tolerance techniques; they operate independently from each other and they are not capable of detecting changes in resource availability.  ...  The mobile agent paradigm has emerged as a promising alternative to overcome the construction challenges of opportunistic grid environments.  ...  In this work, we implemented dynamic fault tolerance mechanisms based on task replication and checkpoints for grid applications.  ... 
doi:10.1002/cpe.1706 fatcat:mc3ykxntungt7fg6cyyoadx22m

Efficient Parallel Application Execution on Opportunistic Desktop Grids [chapter]

Francisco Silva, Fabio Kon, Daniel Batista, Alfredo Goldman, Fabio Costa, Raphael Camargo
2012 Grid Computing - Technology and Applications, Widespread Coverage and New Horizons  
Fault tolerance, which comprises a major requirement for grid middleware as grid environments are highly prone to failures, a characteristic amplified on opportunistic grids due to their dynamism and the use  ...  Due to the heterogeneity, high scalability and dynamism of the execution environment, providing efficient support for application execution on opportunistic grids comprises a major challenge for middleware  ...  , and output data using a reliable and fault-tolerant storage device or service, which is called stable storage.  ... 
doi:10.5772/35599 fatcat:aibrgyyqkzcm5dyqvdgd2nqvqy

Application execution management on the InteGrade opportunistic grid middleware

Francisco José da Silva e Silva, Fabio Kon, Alfredo Goldman, Marcelo Finger, Raphael Y. de Camargo, Fernando Castor Filho, Fábio M. Costa
2010 Journal of Parallel and Distributed Computing  
The contributions cover the related fields of application scheduling, execution management, and fault tolerance.  ...  The InteGrade project is a multi-university effort to build a novel grid computing middleware based on the opportunistic use of resources belonging to user workstations.  ...  Finally, fault tolerance comprises a major requirement for grid middleware as grid environments are highly prone to failures, a characteristic amplified on opportunistic grids due to their dynamism and the  ... 
doi:10.1016/j.jpdc.2010.01.010 fatcat:slvleblnkjgddhzraij5fmwt5i

Reliable management of checkpointing and application data in opportunistic grids

Raphael Y. de Camargo, Fernando Castor, Fabio Kon
2010 Journal of the Brazilian Computer Society  
Our middleware enables the reliable distributed storage of application data in the shared machines in a redundant and fault-tolerant way.  ...  Our evaluation shows that the proposed middleware promotes important improvements in grid data management reliability while imposing a low performance overhead.  ...  We extended InteGrade to use the unused disk space of the shared grid machines to store this data in a reliable and fault-tolerant way.  ... 
doi:10.1007/s13173-010-0016-0 fatcat:z3ygxivblfcp3aelx7pkkfd2je

Strategies for Checkpoint Storage on Opportunistic Grids

R.Y. de Camargo, F. Kon, R. Cerqueira
2006 IEEE Distributed Systems Online  
This article evaluates several strategies for storing checkpoint data in an opportunistic grid environment, including replication, parity information, and erasure coding.  ...  This evaluation compares the computational overhead, storage overhead, and degree of fault tolerance of these strategies. IEEE Distributed Systems Online (vol. 7, no. 9), art. no. 0609-o9001  ...  Acknowledgments A grant from CNPq, Brazil (process no. 55.2028/02-9) supported this work.  ... 
doi:10.1109/mdso.2006.56 fatcat:pk5pywq5bndpjlnbpg6676spqi

Strategies for storage of checkpointing data using non-dedicated repositories on Grid systems

Raphael Y. de Camargo, Renato Cerqueira, Fabio Kon
2005 Proceedings of the 3rd international workshop on Middleware for grid computing - MGC '05  
We consider the tradeoff among computational overhead, storage overhead, and degree of fault-tolerance of these strategies.  ...  Instead, we want to use the shared Grid nodes to store application data in a distributed fashion.  ...  A distributed storage system must ensure scalability and fault-tolerance for the storage, management, and recovery of application data.  ... 
doi:10.1145/1101499.1101500 dblp:conf/middleware/CamargoCK05 fatcat:qgpdyfbu3bhu5lqd6zrs2hd3zq

A Taxonomy and Survey of Grid Resource Planning and Reservation Systems for Grid Enabled Analysis Environment [article]

Arshad Ali, Ashiq Anjum, Atif Mehmood, Richard McClatchey, Ian Willers, Julian Bunn, Harvey Newman, Michael Thomas, Conrad Steenberg
2018 arXiv   pre-print
Management of resources in the Grid environment becomes complex as the resources are geographically distributed, heterogeneous in nature and owned by different individuals and organizations each having  ...  The concept of coupling geographically distributed resources for solving large scale problems is becoming increasingly popular forming what is popularly called grid computing.  ...  This framework allows for implicit security and fault-tolerance.  ... 
arXiv:cs/0407012v2 fatcat:uqvrkjixebhaxlwyqdzwiwskna

Adaptive Resource Sharing in a Web Services Environment [chapter]

Vijay K. Naik, Swaminathan Sivasubramanian, Sriram Krishnan
2004 Lecture Notes in Computer Science  
Web services and grid based technologies hold promise for developing such middleware.  ...  Our approach leverages both the grid and the web services based technologies and overcomes the limitations of existing solutions by providing an additional layer of middleware.  ...  As we noted earlier, a Grid request can be executed in multiple Grid nodes for reasons of fault-tolerance.  ... 
doi:10.1007/978-3-540-30229-2_17 fatcat:dlrpdoahxnbilhhlgfdcpeym3m

European DataGrid Project: Experiences of Deploying a Large Scale Testbed for E-science Applications [chapter]

Fabrizio Gagliardi, Bob Jones, Mario Reale, Stephen Burke
2002 Lecture Notes in Computer Science  
world-wide data and computational Grid on a scale not previously attempted.  ...  To address these problems we are building on emerging computational Grid technologies to establish a research network that is developing the technology components essential for the implementation of a  ...  The authors would like to thank the entire EU DataGrid project for contributing most of the material for this article.  ... 
doi:10.1007/3-540-45798-4_20 fatcat:tdpu52ajvzfo5lkpbduiqocvke

Design and Implementation of a Middleware for Data Storage in Opportunistic Grids

Raphael Y. De Camargo, Fabio Kon
2007 Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)  
In this paper, we present the design and implementation of OppStore, a middleware that provides reliable distributed data storage using the free disk space from shared grid machines.  ...  The system utilizes a two-level peer-to-peer organization to connect grid machines in a scalable and fault-tolerant way.  ...  In this storage mode, the system stores the data only in the local cluster and can use IDA or data replication to provide fault-tolerance.  ... 
doi:10.1109/ccgrid.2007.37 dblp:conf/ccgrid/CamargoK07 fatcat:zxa5iiohdnccji6mmvjtfx4kpq

User-friendly and reliable grid computing based on imperfect middleware

Rob V. van Nieuwpoort, Thilo Kielmann, Henri E. Bal
2007 Proceedings of the 2007 ACM/IEEE conference on Supercomputing - SC '07  
This paper describes the Java Grid Application Toolkit (Java-GAT) that provides a high-level, middleware-independent and site-independent interface to the grid.  ...  The JavaGAT uses nested exceptions and intelligent dispatching of method invocations to handle errors and to automatically select suitable grid middleware implementations for requested operations.  ...  and incomplete middleware and fault tolerance when designing a high-level grid API.  ... 
doi:10.1145/1362622.1362668 dblp:conf/sc/NieuwpoortKB07 fatcat:n7kwp5cvbjdrbiklibv2wsenwa

A taxonomy and survey on autonomic management of applications in grid computing environments

Mustafizur Rahman, Rajiv Ranjan, Rajkumar Buyya, Boualem Benatallah
2011 Concurrency and Computation  
Thus, AC provides a holistic approach for the development of systems/applications that can adapt themselves to meet requirements of performance, fault tolerance, reliability, security, Quality of Service  ...  In Grid computing environments, the availability, performance, and state of resources, applications, services, and data undergo continuous changes during the life cycle of an application.  ...  Nimrod/G [58] is a widely adopted Grid middleware environment for building and managing large computational experiments over distributed resources.  ... 
doi:10.1002/cpe.1734 fatcat:rgaqyybmmvdufphqfcr47ymhoe

A modular meta-scheduling architecture for interfacing with pre-WS and WS Grid resource management services

Eduardo Huedo, Rubén S. Montero, Ignacio M. Llorente
2007 Future generations computer systems  
In the medium term and until a full transition is accomplished, both pre-WS and WS GRAM services will coexist in Grid infrastructures.  ...  Such functionality is demonstrated on an infrastructure that comprises resources from a research testbed, based on the Globus Toolkit 4.0, and the EGEE production infrastructure, based on the LCG middleware  ...  Other qualitative metrics, like security and fault tolerance, are considered crucial for a successful Grid infrastructure [26].  ... 
doi:10.1016/j.future.2006.07.013 fatcat:cjmdynaqdjfnrjsjcyron5cllm

Design, implementation, and performance of an automatic configuration service for distributed component systems

Fabio Kon, Jeferson Roberto Marques, Tomonori Yamane, Roy H. Campbell, M. Dennis Mickunas
2005 Software, Practice & Experience  
The problem behind all these difficulties is the lack of a unified model for representing dependencies and mechanisms for dealing with these dependencies.  ...  Current systems rely heavily on manual configuration by users and system administrators. This is tolerable now, when users have to manage a few computers.  ...  Herbert Yutaka Watanabe was responsible for the J2EE implementation of the Automatic Configuration Service.  ... 
doi:10.1002/spe.654 fatcat:m44r2gttevhzvkjz2nlmnk5ksm