Estimating Silent Data Corruption Rates Using a Two-Level Model
[article]
2020
arXiv
pre-print
High-performance and safety-critical system architects must accurately evaluate the application-level silent data corruption (SDC) rates of processors to soft errors. ...
We also show that using just one of the two steps can overestimate SDC rates and produce different trends---the composition of the two is needed for accurate reliability modeling. ...
and lead to Silent Data Corruption (SDC). ...
arXiv:2005.01445v1
fatcat:f4fd2rmii5bj3lpm5fszqsrg7a
Evaluating the impact of Undetected Disk Errors in RAID systems
2009
2009 IEEE/IFIP International Conference on Dependable Systems & Networks
Our implementation enables us to model arbitrary storage systems and workloads and estimate the rate of undetected data corruptions. ...
While RAID systems have proven effective in protecting data from traditional disk failures, silent data corruption events remain a significant problem unaddressed by RAID. ...
We can estimate the rates of UDE occurrence by using a combination of the data presented in [1] and the data presented in [2]. ...
doi:10.1109/dsn.2009.5270353
dblp:conf/dsn/RozierBDHRZ09
fatcat:ol6kkftzh5cxtnvay3ysdmf4ha
System-level analysis of soft error rates and mitigation trade-off explorations
2010
2010 IEEE International Reliability Physics Symposium
This paper presents a novel system-level analysis of soft error rates (SER) based on the Transaction Level Model (TLM) of a targeted System-On-a-Chip (SoC). ...
This analysis runs 1000x faster than the conventional SoC analysis using a gate-level model. ...
The only difference is that for silent data corruptions, we need to calculate them for each output data object. ...
doi:10.1109/irps.2010.5488685
fatcat:nd6cmzvdg5hfnfz5og5j24r3hi
Cross-Layer Resilience Against Soft Errors: Key Insights
[chapter]
2020
Embedded Systems
Such soft errors may cause malfunction of the system due to corruption of data or control flow, which may lead to unacceptable risks for life or property in safety-critical applications. ...
Here, cross-layer resilience techniques aim at finding lower cost solutions by providing accurate estimation of soft error resilience combined with a systematic exploration of protection techniques that ...
The blue bar shows the rate of silent data corruption caused when a faulty cache line is read. ...
doi:10.1007/978-3-030-52017-5_11
fatcat:sbwfuocpz5duzarrb4ysbee3cm
Understanding soft error propagation using Efficient vulnerability-driven fault injection
2012
IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012)
With CriticalFault, our results show that the injection space is reduced by 29 % and 59 % of the biased injections cause either software aborts or silent data corruptions, both are improvements from SFI ...
To evaluate, statistical fault injection (SFI) is often used to estimate the error coverage of the underlying method. ...
Specifically, an injected error that is not derated will result in an application-level or OS-level abort or a silent data corruption. ...
doi:10.1109/dsn.2012.6263923
dblp:conf/dsn/XuL12
fatcat:e7k7kwagszalxdalohl25ow6pu
The Significance of Storage in the "Cost of Risk" of Digital Preservation
2009
International Journal of Digital Curation
We review the vital role of storage and show how planning for long-term preservation of data should consider the risks involved in using digital storage technology. ...
We examine current modelling of costs and risks in digital preservation, concentrating on the Total Cost of Risk when using digital storage systems for preserving audiovisual material. ...
If the audio is sampled at 44.1 kHz (the rate used on CDs), each sample represents about 23 microseconds of data. ...
doi:10.2218/ijdc.v4i3.125
fatcat:uuufilkkdjg3df5mghtfra32mq
Resilient N-Body Tree Computations with Algorithm-Based Focused Recovery: Model and Performance Analysis
[chapter]
2017
Lecture Notes in Computer Science
This paper presents a model and performance study for Algorithm-Based Focused Recovery (ABFR) applied to N-body computations, subject to latent errors. ...
We make a detailed comparison with the classical Checkpoint/Restart (CR) approach. ...
corrupted data. ...
doi:10.1007/978-3-319-72971-8_8
fatcat:4j3jnyaq4fdoxli7xu5fzfymum
Modeling the Fault Tolerance Consequences of Deduplication
2011
2011 IEEE 30th International Symposium on Reliable Distributed Systems
We present a framework composed of data analysis methods and a model of data deduplication that is useful in studying the reliability impact of data deduplication. ...
The framework is useful for determining a deduplication strategy that is estimated to satisfy a set of reliability constraints supplied by a user. ...
data loss, and the impact of silent data corruptions, though the former is easily countered by using higher level RAID configurations. ...
doi:10.1109/srds.2011.18
dblp:conf/srds/RozierSZMUY11
fatcat:4z2nkaaxxredfgl35qjgt6cwxe
Bamboo ECC: Strong, safe, and flexible codes for reliable computer memory
2015
2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)
Relative to the state-of-the-art single-tier error protection, Bamboo ECC codes have superior correction capabilities, all but eliminate the risk of silent data corruption, and can also increase redundancy ...
Growing computer system sizes and levels of integration have made memory reliability a primary concern, necessitating strong memory error protection. ...
If there are two pin faults on two chips, QPC can correct both of them while AMD chipkill must report a DUE (or, in some cases, AMD chipkill results in silent data corruption). ...
doi:10.1109/hpca.2015.7056025
dblp:conf/hpca/KimSE15
fatcat:gwtcs4iibrf53pvoy76hykbbxe
We propose a novel memory protection scheme called CLEAN (Chipkill-LEvel reliable and Access granularity Negotiable), which enables us to balance the contradicting demands of fine-grained (FG) access and ...
To close a potentially significant detection coverage gap due to CLEAN's detection mechanism coupled with permanent faults, we design a simple access granularity enforcement mechanism. ...
silent data corruption events. ...
doi:10.1145/2830772.2830799
dblp:conf/micro/GongRKCE15
fatcat:mjjew46fwzgv3iuilrlfnjmkki
Addressing multiple bit/symbol errors in DRAM subsystem
[article]
2020
arXiv
pre-print
Our scheme makes use of a hash in combination with Error Correcting Code (ECC) to avoid silent data corruptions (SDCs). SSCMSD can also enhance the capability of detecting errors in address bits. ...
Current servers mostly use CHIPKILL based schemes to tolerate up-to one/two symbol errors per DRAM beat. ...
The probability of false negative is estimated by using the upper bound on SDC rate for the baseline SSC-decoder (8%) and collision probability for an N-bit hash is estimated by birthday paradox (2^(−N/2) ...
arXiv:1908.01806v2
fatcat:dwti5nsgrja5dcj2ocvvqmudpm
Exploring Partial Replication to Improve Lightweight Silent Data Corruption Detection for HPC Applications
[chapter]
2016
Lecture Notes in Computer Science
Silent data corruption (SDC) poses a great challenge for high-performance computing (HPC) applications as we move to extreme-scale systems. ...
Accurate predictions allow us to detect corruptions when data values are far "enough" from them. ...
Introduction Silent data corruption (SDC) involves corruption to an application's memory state (including both code and data) caused by undetected soft errors, that is, errors that modify the information ...
doi:10.1007/978-3-319-43659-3_31
fatcat:toubdyjsq5b65lvizf5kx7me7e
Bit Preservation: A Solved Problem?
2010
International Journal of Digital Curation
This paper is in four parts: Claims, reviewing a typical claim of storage system reliability, showing that it provides no useful information for bit preservation purposes. Theory, proposing "bit half-life ...
For years, discussions of digital preservation have routinely featured comments such as "bit preservation is a solved problem; the real issues are ...". ...
a significant rate of silent disk errors that would lead to silent data corruption. ...
doi:10.2218/ijdc.v5i1.148
fatcat:4jrjl3kqa5d37g5inrqwazcxae
Political Risk and Real Exchange Rate: What Can We Learn from Recent Developments in Panel Data Econometrics for Emerging and Developing Countries?
2018
Journal of Quantitative Economics
We use annual data from the International Country Risk Guide database over the 1984 to 2016 period. ...
: i) countries experiencing a high degree of corruption, a high risk to investment, or a high degree of political instability tend to experience a real exchange rate depreciation, ii) there exists strong ...
The previous literature on the estimation of long-run effects using panel data allowed for the estimation of long-run effects using panel data, but it doesn't allow for cross-sectionally dependent errors ...
doi:10.1007/s40953-018-0145-4
fatcat:kzt6dukjs5fmvggqumjadi3tla
Zettabyte reliability with flexible end-to-end data integrity
2013
2013 IEEE 29th Symposium on Mass Storage Systems and Technologies (MSST)
Z²FS provides dynamical tradeoffs between performance and protection and offers Zettabyte Reliability, which is one undetected corruption per Zettabyte of data read. ...
For comparison, we implement a straightforward End-to-End ZFS (E²ZFS) with the same protection scheme for all components. ...
Overview The reliability of a storage system can be evaluated based on how likely corruption would occur. There are two types of corruption: detected and undetected (silent data corruption, SDC). ...
doi:10.1109/msst.2013.6558423
dblp:conf/mss/ZhangMAA13
fatcat:ky5jcx5bhjgcxch2axsk23yudm