Fast crash recovery in RAMCloud
2011
Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles - SOSP '11
RAMCloud is a DRAM-based storage system that provides inexpensive durability and availability by recovering quickly after crashes, rather than storing replicas in DRAM. ...
scale to recover quickly after crashes. ...
Bigtable, like RAMCloud, implements fast crash recovery (during which data is unavailable) rather than online replication. ...
doi:10.1145/2043556.2043560
dblp:conf/sosp/OngaroRSOR11
fatcat:iglpm5pr55eajbwylbjhpebxe4
The RAMCloud Storage System
2015
ACM Transactions on Computer Systems
RAMCloud's crash recovery mechanism harnesses the resources of the entire cluster working concurrently so that recovery performance scales with cluster size. ...
The log-structured approach also simplifies crash recovery and utilizes DRAM twice as efficiently as traditional storage allocators such as malloc. ...
In addition, fast crash recovery requires fast failure detection, and the system must deal with secondary errors that occur during recovery. ...
doi:10.1145/2806887
fatcat:fg3r5yahbjhxhcor6m2w2q6bxy
Exploiting Commutativity For Practical Fast Replication
[article]
2017
arXiv
pre-print
In RAMCloud, CURP improved write latency by ~2x (13.8 us -> 7.3 us) and write throughput by 4x. ...
This strategy allows most operations to complete in 1 RTT (the same as an unreplicated system). We implemented CURP in the Redis and RAMCloud storage systems. ...
CURP can be used with RAMCloud without sacrificing its fast crash recovery [15]. ...
arXiv:1710.09921v1
fatcat:ox5t6b2jmnfi3cy4mvczjwydt4
An Empirical Evaluation of How the Network Impacts the Performance and Energy Efficiency in RAMCloud
2017
2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)
Through a study carried on RAMCloud, we focus on two settings: 1) clients are collocated within the same network as the storage servers (with Infiniband interconnects); 2) clients access the servers from ...
In-memory storage systems emerged as a de-facto building block for today's large scale Web architectures and Big Data processing frameworks. ...
This enables RAMCloud to harness large-scale to enable fast crash recovery. ...
doi:10.1109/ccgrid.2017.127
dblp:conf/ccgrid/TalebIAC17
fatcat:sqfmdzvgcncfffylubnc6zgviy
A SURVEY ON CLOUD STORAGE – IMPACT OF STORAGE ON THE PERFORMANCE
2018
International Journal of Research in Engineering and Technology
In case of disasters, cloud storage can help in very quick recovery of data. Bandwidth usage can also be reduced, by sharing access-links instead of the complete files. ...
This work aims to assist the reader in proper selection of architecture based on the types of operation the user of the architecture intends to have in his/her application. ...
Each recovery master generates the hash table from log-structured data which is later merged. The key to fast recovery is utilizing the scale of the RAMCloud cluster. ...
doi:10.15623/ijret.2018.0710007
fatcat:q54suv7n6jcanieoqwotohdl7u
The case for RAMCloud
2011
Communications of the ACM
With scalable high-performance storage entirely in DRAM, RAMCloud will enable a new breed of data-intensive applications. By John Ousterhout, Parag Agrawal, David Erickson, Christos Kozyrakis, Jacob Leverich ...
RAMCloud stores all of its information in the main ... Key insights: The web has driven development of new large-scale applications that have effectively scaled compute power and storage capacity but have not ...
In any case, all these technologies are similar in that they provide fast access to small chunks of data. ...
doi:10.1145/1965724.1965751
fatcat:nmp2qlgjfvcivhx3fwegyegvuq
Taming uncertainty in distributed systems with help from the network
2015
Proceedings of the Tenth European Conference on Computer Systems - EuroSys '15
Network and process failures cause complexity in distributed applications. ...
The research was supported in part by NSF grants CNS-1055057, CNS-1040083, and CCF-1048269. ...
Specifically, RAMCloud detects failures using a short timeout of hundreds of milliseconds; if the coordinator times out on a master, the coordinator starts data recovery, which is very fast. ...
doi:10.1145/2741948.2741976
dblp:conf/eurosys/LenersGAW15
fatcat:5enktayvpvgkvddmvwmrjm24dm
Replication-Based Fault-Tolerance for Large-Scale Graph Processing
2014
2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
This paper observes that the vertex replicas created for distributed graph computation can be naturally extended for fast in-memory recovery of graph states. ...
in-memory reconstruction of failed vertices from replicas in other machines. ...
This work is supported in part by Doctoral ...
doi:10.1109/dsn.2014.58
dblp:conf/dsn/WangZCCG14
fatcat:vfuicg3rqrf3lbivizr52ggwr4
Stateless Network Functions
2015
Proceedings of the 2015 ACM SIGCOMM Workshop on Hot Topics in Middleboxes and Network Function Virtualization - HotMiddlebox '15
In this paper, we propose that network functions should be similarly redesigned to be stateless. ...
Our Click-based prototype integrates with RAMCloud; using NAT as an example network function, we demonstrate that we are able to create stateless network functions that maintain the desired performance ...
ACKNOWLEDGEMENTS This work was funded in part by the following grants: NSF NeTS 1320389 and NSF XPS 1337399. ...
doi:10.1145/2785989.2785993
dblp:conf/sigcomm/KablanCHJK15
fatcat:grtdkb4xezemlbjymjjifu6kdm
DXRAM's Fault-Tolerance Mechanisms Meet High Speed I/O Devices
[article]
2018
arXiv
pre-print
But, when storing the data in RAM on thousands of servers one has to consider server failures. Only a few in-memory key-value stores provide automatic online recovery of failed servers. ...
The most prominent example of these systems is RAMCloud. Another system with sophisticated fault-tolerance mechanisms is DXRAM which is optimized for small data objects. ...
A fast reorganization is important to keep a constant write throughput (provide enough free space for writes) and to allow a fast crash recovery (less invalid/outdated objects to process). ...
arXiv:1807.03562v2
fatcat:p4yobou5vjgqrn4lvdumamuelq
Assise: Performance and Availability via NVM Colocation in a Distributed File System
[article]
2020
arXiv
pre-print
To demonstrate this, we built the Assise distributed file system, based on a persistent, replicated coherence protocol for managing a set of server-colocated PMMs as a fast, crash-recoverable cache between ...
Fail-over and Recovery Assise caches file system state with persistence in local NVM, which it can use for fast recovery. Assise optimizes recovery performance according to crash prevalence. ...
RAMCloud requires a full-bisection bandwidth network for fast recovery. Assise leverages colocated NVM for recovery and does not require full-bisection bandwidth or asynchronous backup storage. ...
arXiv:1910.05106v2
fatcat:3sjpue3tqzd3haqnh4ka72fezi
FluidMem: Memory as a Service for the Datacenter
[article]
2017
arXiv
pre-print
In this paper, we present FluidMem, a complete system to realize disaggregated memory in the datacenter. ...
Disaggregating resources in data centers is an emerging trend. ...
As an example, RAMCloud provides crash-recovery, tolerating node failures without loss of availability to the data store. ...
arXiv:1707.07780v1
fatcat:thnnbfklg5bmxgnwg4ngtddoly
LogBase
2012
Proceedings of the VLDB Endowment
In this paper, we introduce LogBase - a scalable log-structured database system that adopts log-only storage for removing the write bottleneck and supporting fast system recovery. ...
Write-ahead-logging is a common approach for providing recovery capability while improving performance in most storage systems. ...
Acknowledgments This work was in part supported by the Singapore MOE Grant No. R252-000-454-112. ...
doi:10.14778/2336664.2336673
fatcat:afskwwel3zb77hzsxqedjwtmay
LogBase: A Scalable Log-structured Database System in the Cloud
[article]
2012
arXiv
pre-print
In this paper, we introduce LogBase - a scalable log-structured database system that adopts log-only storage for removing the write bottleneck and supporting fast system recovery. ...
Write-ahead-logging is a common approach for providing recovery capability while improving performance in most storage systems. ...
Acknowledgments This work was in part supported by the Singapore MOE Grant No. R252-000-454-112. ...
arXiv:1207.0140v1
fatcat:ek6r2lr36bfg3hhg7dp6pqxhma
Stretching Multi-Ring Paxos
[article]
2015
arXiv
pre-print
., independent Paxos instances), a large number of replicas in a ring, and a global deployment. ...
We also report on the performance of recovery under peak load and present two novel extensions to boost Multi-Ring Paxos's performance. ...
To recover the data fast RAMCloud relies on the collective force of thousands of servers. ...
arXiv:1504.04942v1
fatcat:ipncz5mh7jb3flmqfg3deghvyu