ExtraVirt

Dominic Lucchetti, Steven K. Reinhardt, Peter M. Chen
2005 Proceedings of the twentieth ACM symposium on Operating systems principles - SOSP '05  
SOSP'05, October 23-26, 2005, Brighton, United Kingdom. Copyright 2005 ACM 1-59593-079-5/05/0010 ...$5.00.
Reliability is becoming an increasingly important issue in modern processor design. Smaller feature sizes and more numerous transistors are projected to increase the frequency of transient faults [4, 5]. Our project, ExtraVirt, leverages the trend toward multi-core and multi-processor systems to survive these transient faults. Our goals are (1) to add fault tolerance without modifying existing operating systems, applications or hardware, (2) to minimize the time spent executing software that ... not tolerate faults, and (3) to minimize the time and space overhead needed to detect and recover from faults. We accomplish these goals by leveraging virtual-machine technology and by sharing memory and I/O devices across replicas. ExtraVirt extends prior work on VM-level fault tolerance [2] by detecting and recovering from non-fail-stop faults and by running multiple replicas efficiently on a single machine.
Detecting and recovering from processor faults requires running multiple replicas, comparing their outputs before they go to external devices (e.g., network, disk, monitor), and correcting faulty replicas before their output becomes visible. The unit of replication in ExtraVirt is a virtual machine; this accomplishes our first goal of enabling fault tolerance without modifying existing operating systems, applications or hardware [2]. ExtraVirt keeps replicated virtual machines consistent in the presence of non-deterministic input and events through virtual-machine logging and replay [2, 3], leaving processor faults as the only remaining source of non-determinism. Any divergence in output can thus be attributed to a processor fault. ExtraVirt manages its replicas and additional functionality through extensions residing in the Xen virtual-machine monitor [1].
Our second goal is to maximize the system's tolerance of faults by minimizing the time spent executing software that is not replicated. Since only software above the replication-management layer (RML) is replicated automatically, this goal dictates that we locate the RML below as much software as possible. Implementing the RML as an extension to a virtual-machine monitor is the first step toward this goal: all operating system and application software runs in a replicated virtual machine above the RML. Even with a virtual-machine approach, however, the virtual-machine monitor and RML remain vulnerable to faults. An open research question is how to tolerate faults that occur while executing outside the automatically replicated software. One possible approach is compiler-based replication of execution within the hypervisor [6].
Our third goal is to minimize the time and space overhead needed to detect and recover from faults. To minimize time overhead, we leverage the fact that detecting a fault requires only two replicas, while identifying which replica is faulty requires a third. Thus, when the two replicas used for detection diverge, ExtraVirt dynamically creates a third replica by replaying from a prior, known-good state. ExtraVirt periodically creates a known-good state by stopping the replicas at an identical point in their executions and verifying that their states remain identical.
To minimize the memory overhead of running multiple replicas, ExtraVirt shares memory between replicas, using a copy-on-write approach to ensure isolation. Copy-on-write creates a private copy of a page when either replica modifies it, which allows sharing while preserving the independence of failures between replicas. In the absence of faults, the memory contents of one replica at a given point of execution should match the memory contents of the other replica at the same point of execution. ExtraVirt takes advantage of this similarity by opportunistically combining pages that have been verified to be identical between replicas [7].
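The copy-on-write sharing and opportunistic page merging described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not ExtraVirt's or Xen's implementation: the class `ReplicaMemory`, its methods, the replica names `r0` and `r1`, and the 16-byte page size are all hypothetical and chosen for illustration.

```python
class ReplicaMemory:
    """Toy model of page sharing between replicas (all names are hypothetical)."""

    def __init__(self, num_pages, replicas=("r0", "r1"), page_size=16):
        self.page_size = page_size
        # Physical frames: frame id -> immutable page contents.
        self.frames = {n: bytes(page_size) for n in range(num_pages)}
        self.next_frame = num_pages
        # Every replica starts out mapping each page to the same shared frame.
        self.page_table = {r: {n: n for n in range(num_pages)} for r in replicas}

    def _refcount(self, frame):
        # Number of (replica, page) mappings that point at this frame.
        return sum(
            1
            for table in self.page_table.values()
            for mapped in table.values()
            if mapped == frame
        )

    def read(self, replica, page):
        return self.frames[self.page_table[replica][page]]

    def write(self, replica, page, offset, data):
        if offset < 0 or offset + len(data) > self.page_size:
            raise ValueError("write does not fit inside the page")
        frame = self.page_table[replica][page]
        if self._refcount(frame) > 1:
            # Copy-on-write: give the writer a private copy so the other
            # replica's view of the page is unaffected.
            private = self.next_frame
            self.next_frame += 1
            self.frames[private] = self.frames[frame]
            self.page_table[replica][page] = private
            frame = private
        buf = bytearray(self.frames[frame])
        buf[offset:offset + len(data)] = data
        self.frames[frame] = bytes(buf)

    def merge_identical(self, a, b):
        """Re-share pages of replicas a and b whose contents compare equal.

        Returns the number of pages merged. In this sketch the comparison
        stands in for verification at a point where both replicas are stopped.
        """
        merged = 0
        for page, frame_a in self.page_table[a].items():
            frame_b = self.page_table[b][page]
            if frame_a != frame_b and self.frames[frame_a] == self.frames[frame_b]:
                self.page_table[b][page] = frame_a
                if self._refcount(frame_b) == 0:
                    del self.frames[frame_b]  # reclaim the now-unused frame
                merged += 1
        return merged

    def frames_in_use(self):
        return len(self.frames)
```

In this model, a write by one replica to a shared page allocates one extra frame, and once the other replica has performed the same write, `merge_identical` returns the pair to a single shared frame.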
ExtraVirt verifies output before sending it to external devices and so does not need to replicate disk storage, nor does it suffer increased network overhead. We are currently implementing replica management in ExtraVirt. Open research issues include how to handle permanent faults, how to tolerate faults that occur while executing outside the automatically replicated software, and where the RML should be located relative to device drivers.
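The output-verification scheme in the abstract (two replicas suffice to detect a fault, and a third, created on demand by replay, identifies the faulty one) can be sketched as follows. This is a minimal illustration under strong assumptions: each replica is modeled as a deterministic Python function of its logged input, and `release_output` and `spawn_third` are hypothetical names, not part of ExtraVirt or Xen.

```python
from typing import Callable, Optional, Tuple

# Hypothetical model: a replica maps logged input to the output it wants to emit.
Replica = Callable[[bytes], bytes]


def release_output(
    logged_input: bytes,
    replica_a: Replica,
    replica_b: Replica,
    spawn_third: Callable[[], Replica],
) -> Tuple[bytes, Optional[str]]:
    """Compare pending outputs before they reach an external device.

    Returns (verified_output, faulty) where faulty is None, "a", or "b".
    spawn_third stands in for replaying from a known-good state; it is
    only called when the two detection replicas disagree.
    """
    out_a = replica_a(logged_input)
    out_b = replica_b(logged_input)
    if out_a == out_b:
        # Common case: two matching outputs, nothing to arbitrate.
        return out_a, None

    # Divergence detected: a third replica breaks the tie.
    out_c = spawn_third()(logged_input)
    if out_c == out_a:
        return out_a, "b"
    if out_c == out_b:
        return out_b, "a"
    raise RuntimeError("no two replicas agree; faulty replica cannot be identified")
```

The third replica is created lazily, so the fault-free path pays only for one comparison; this mirrors the abstract's argument for keeping time overhead low, though the real system compares virtual-machine output rather than function return values.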
doi:10.1145/1095810.1118621 fatcat:mcb6tiwe7jbvfmo5ejcx2muek4