On Modeling the Lifetime Reliability of Homogeneous Manycore Systems

Lin Huang, Qiang Xu
2008 2008 14th IEEE Pacific Rim International Symposium on Dependable Computing  
In this work, we model the lifetime reliability of homogeneous manycore systems using a load-sharing nonrepairable k-out-of-n:G system with general failure distributions for embedded cores.  ...  We then use the proposed model to analyze the impact of different redundant schemes and configurations on the lifetime reliability of manycore systems.  ...  For standby redundant systems, an embedded core can be a "spare" unit, and such systems involve both warm standby (wait state) and cold standby (spare state) units as well as processing cores.  ... 
doi:10.1109/prdc.2008.23 dblp:conf/prdc/HuangX08 fatcat:xe2y4t7sozbclfkcbj7tbml3qi

Lifetime Reliability for Load-Sharing Redundant Systems With Arbitrary Failure Distributions

Lin Huang, Qiang Xu
2010 IEEE Transactions on Reliability  
Then, the system works in a gracefully degrading manner such that less than components share the workload, until the number of good components is less than .  ...  In this work, a general closed-form expression is presented for the lifetime reliability of load-sharing k-out-of-n:G hybrid redundant systems.  ...  ACKNOWLEDGMENT The authors wish to thank the associate editor, and the anonymous reviewers for their constructive comments that have helped to improve the article.  ... 
doi:10.1109/tr.2010.2048679 fatcat:dcec3zudxfdghcu6ueywejnbiq

Characterizing the lifetime reliability of manycore processors with core-level redundancy

Lin Huang, Qiang Xu
2010 2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)  
With aggressive technology scaling, integrated circuits suffer from ever-increasing wearout effects and their lifetime reliability has become a serious concern for the industry.  ...  We then use the proposed model to analyze the lifetime reliability for manycore processors with various redundancy configurations.  ...  Our experiments compare the lifetimes and performances of gracefully degrading systems, processor rotation systems and standby redundant systems, under various workloads.  ... 
doi:10.1109/iccad.2010.5654250 dblp:conf/iccad/HuangX10 fatcat:3i4hbgh3sjekleyewiac4pwoxq

Phantom redundancy: a register transfer level technique for gracefully degradable data path synthesis

R. Karri, B. Iyer, I. Koren
2002 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
In contrast to spare-based approaches, phantom redundancy is a recovery technique that does not use any standby spares.  ...  In this paper we present an area-efficient register transfer level technique for gracefully degradable data path synthesis called phantom redundancy.  ...  The hardware models for the basic and gracefully degradable data path synthesis are shown in Figure 5.  ... 
doi:10.1109/tcad.2002.800450 fatcat:cx4737dt5nh6nhgcxx7lv4cgyq

Multi-perspective evaluation of self-healing systems using simple probabilistic models

Rean Griffith, Gail Kaiser, Javier Alonso López
2009 Proceedings of the 6th international conference on Autonomic computing - ICAC '09  
Quantifying the efficacy of self-healing systems is a challenging but important task, which has implications for increasing designer, operator and end-user confidence in these systems.  ...  At deployment time, system integrators and operators need to understand how the self-healing mechanisms work and how their operation impacts the system's reliability, availability and serviceability (RAS  ...  , accounting for incomplete healing and accounting for healing specific resources (spare disks, hot standbys, etc.).  ... 
doi:10.1145/1555228.1555245 dblp:conf/icac/GriffithKL09 fatcat:in7kudorrvh5pbzw5nc3afcnam

Organic embedded architecture for sustainable FPGA soft-core processors

Kening Zhang, Navid Khoshavi, Jaafar M. Alghazo, Ronald F. De Mara
2015 2015 Annual Reliability and Maintainability Symposium (RAMS)  
Innovations include autonomously degraded online throughput during regeneration, spare configuration aging and outlier driven repair assessment, and a uniform design for AEs despite the fact that they  ...  Mission-critical systems require increasing capability for fault handling and self-adaptation as their system complexities and inter-dependencies increase.  ...  reliability and power awareness.  ... 
doi:10.1109/rams.2015.7105065 fatcat:m3xqneg7xzfdrhsr532klozhbe

A Formal Model for Constraint-Based Deployment Calculation and Analysis for Fault-Tolerant Systems [chapter]

Klaus Becker, Bernhard Schätz, Michael Armbruster, Christian Buckl
2014 Lecture Notes in Computer Science  
We present an arithmetic system model with formal constraints of the deployment-problem that can be solved by a SMT-Solver. We evaluate our approach by showing an example problem and its solution.  ...  We propose a new fault-tolerant SW/HW architecture for electric vehicles with inherent safety capabilities that enable fail-operational features.  ...  This work is partially funded by the German Federal Ministry for Economic Affairs and Energy (BMWi) under grant no. 01ME12009 through the project RACE (Robust and Reliant Automotive Computing Environment  ... 
doi:10.1007/978-3-319-10431-7_15 fatcat:vfbn3wb7cjdujaepog2zjlmnpq

Optimal Design of $k$-out-of-$n$:G Subsystems Subjected to Imperfect Fault-Coverage

S.V. Amari, H. Pham, G. Dill
2004 IEEE Transactions on Reliability  
It is assumed that there exists a k-out-of-n:G subsystem in a nonseries-parallel system and, except for this subsystem, the redundancy configurations of all other subsystems are fixed.  ...  This paper also presents optimal design polices which maximize overall system reliability. As a special case, results are presented for k-out-of-n:G systems subjected to imperfect fault-coverage.  ...  The gracefully degradable systems studied in [8] are the special cases of the model with and . 2) k-out-of-n:G subsystems with perfect fault-coverage [10, model 1], [13, model 1] are the special cases  ... 
doi:10.1109/tr.2004.837703 fatcat:z2xs3rwhq5empcnarxskqtq4qa

Software-implemented fault-tolerance and separate recovery strategies enhance maintainability [substation automation]

G. Deconinck, V. De Florio, O. Botti
2002 IEEE Transactions on Reliability  
This framework-approach increases the availability and reliability of the application at a justifiable cost, also thanks to the re-usability of the components in different target systems.  ...  The resulting tool matches well, e.g., with current industrial requirements for embedded distributed systems, calling for adaptable and reusable software components.  ...  ACKNOWLEDGMENT The authors would like to thank the Associate Editor and anonymous referees for their useful comments.  ... 
doi:10.1109/tr.2002.1011520 fatcat:pu5x7yow2fdfznjtft6urr2mny

Dependability Modeling of Software Systems with UML and DAM: A Guide for Real-Time Practitioners

Simona Bernardi, José Merseguer, Dorina C. Petriu
2022 Software  
The modeling of system non-functional properties is a broad field. Among these properties, dependability is an important one for real-time and embedded systems.  ...  In particular, the DAM (dependability analysis and modeling) profile provides a modeling framework for dependability in the model-driven paradigm.  ...  The objective is to bring the system to a gracefully degraded state in which the basic functionalities can still be provided; therefore, activities for the replacement and/or reallocation of system components  ... 
doi:10.3390/software1020007 fatcat:fejv7yf6zrfm5jiumvpfroayoy

AgeSim: A simulation framework for evaluating the lifetime reliability of processor-based SoCs

Lin Huang, Qiang Xu
2010 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)  
This paper proposes a novel simulation framework for evaluating the lifetime reliability of processor-based system-on-a-chips (SoCs), namely AgeSim, which facilitates designers to make design decisions  ...  Unlike existing work, AgeSim can simulate failure mechanisms with arbitrary lifetime distributions and do not require to trace the system's reliability-related factors over its entire lifetime, and hence  ...  of China (NSFC) under grant No. 60876029, and in part by a grant N_CUHK417/08 from the NSFC/RGC Joint Research Scheme.  ... 
doi:10.1109/date.2010.5457238 dblp:conf/date/HuangX10 fatcat:7rgcmax4hzh2towrxqerp6zyo4

Analysis of fault tolerant computer systems

U. Sumita, J.G. Shanthikumar, Y. Masuda
1987 Microelectronics and reliability  
In Gracefully Degrading Systems, all operative units in the system are kept active for executing tasks.  ...  As described in Section 0, the gracefully degrading system reacts to a detected failure by (CL) Computational capacity before the first system failure and computational reliability Given a, let Igia be  ... 
doi:10.1016/0026-2714(87)90622-6 fatcat:fkjdb5yoajhrnovtmy673vyf5i

Resilience in Risk Analysis and Risk Assessment [chapter]

Stig Johnsen
2010 IFIP Advances in Information and Communication Technology  
Resilience is the ability of a system to react to and recover from disturbances with minimal effects on dynamic stability.  ...  close to the performance boundaries, the establishment and exploration of common mental models, the presence of flexibility in systems and organizations, and the reduction of complexity and coupling.  ...  Redundancy supports the ability of a system to degrade gracefully. Redundancy can be achieved via standby spares or through the concurrent use of multiple devices.  ... 
doi:10.1007/978-3-642-16806-2_15 fatcat:vouv5bbjujdtrgtb343qzaybnm

Survey and future directions of fault-tolerant distributed computing on board spacecraft

Muhammad Fayyaz, Tanya Vladimirova
2016 Advances in Space Research  
A set of metrics is designed and mathematical models of availability and reliability are developed, which are used to evaluate the proposed distributed computing architecture and fault management scheme  ...  In this thesis a novel cooperative task-oriented fault-tolerant distributed computing (FTDC) architecture is proposed, which caters for high performance and reliability in systems on board spacecraft.  ...  It is also evident that the reliability degrades relatively more gracefully compared to the other systems, which is desired in mission critical systems.  ... 
doi:10.1016/j.asr.2016.08.017 fatcat:szoac6aiwvbs3d2dyh5smxsgqa

Architectural core salvaging in a multi-core processor for hard-error tolerance

Michael D. Powell, Arijit Biswas, Shantanu Gupta, Shubhendu S. Mukherjee
2009 SIGARCH Computer Architecture News  
While caches, with their regular and repetitive structures, are easily covered against hard errors by providing spare arrays or spare lines, structures within a core are neither as regular nor as repetitive  ...  for many workloads and no worse performance than core disabling for the remainder.  ...  ACKNOWLEDGEMENTS The authors wish to acknowledge the work of Omer Khan who made significant improvements to the Asim model while interning with the AMI group within SPEARS at Intel Massachusetts.  ... 
doi:10.1145/1555815.1555769 fatcat:2um5zjmejjbp5op5qxlvtattbm