149 Hits in 4.8 sec

Increasing the trustworthiness of commodity hardware through software

Kevin Elphinstone, Yanyan Shen
2013 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)  
We explore leveraging multicore processors to provide redundancy, and report the results of our initial performance investigation.  ...  Such an operating system could potentially consolidate safety and security critical software on a single device where previously multiple devices were used.  ...  ACKNOWLEDGEMENTS NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre  ... 
doi:10.1109/dsn.2013.6575328 dblp:conf/dsn/ElphinstoneS13 fatcat:2uvav7y6h5djbipmlmgy7ip6jy

Architectures for online error detection and recovery in multicore processors

D Gizopoulos, M Psarakis, S V Adve, P Ramachandran, S K S Hari, D Sorin, A Meixner, A Biswas, X Vera
2011 2011 Design, Automation & Test in Europe  
It discusses taxonomy of representative approaches and presents a qualitative comparison based on: hardware cost, performance overhead, types of faults detected, and detection latency.  ...  Extremely complex, massively parallel, multi-core processor chips fabricated in these technologies will become more vulnerable to: (a) environmental disturbances that produce transient (or soft) errors  ...  With the advent of multiple on-chip threads in simultaneous multithreading (SMT) and chip multiprocessor (CMP) architectures, hardware redundancy techniques such as dual modular redundancy (DMR) and triple  ... 
doi:10.1109/date.2011.5763096 dblp:conf/date/GizopoulosPARHSMBV11 fatcat:uli4r7onhrd5tatt4l2soy4pom

Survey of fault tolerance techniques for shared memory multicore/multiprocessor systems

Hamid Mushtaq, Zaid Al-Ars, Koen Bertels
2011 2011 IEEE 6th International Design and Test Workshop (IDT)  
We classify fault tolerance into four different steps which are proactive fault management, error detection, fault diagnosis and recovery and discuss related work for each step, with focus on techniques  ...  We also highlight the additional difficulties in tolerating faults for parallel execution on shared memory multicore/multiprocessor systems.  ...  INTRODUCTION It has become possible to integrate billions of transistors on a single die with modern nano-scale technology and therefore allow many processing cores to be implemented on the same chip.  ... 
doi:10.1109/idt.2011.6123094 dblp:conf/idt/MushtaqAB11 fatcat:hkc5aszrdjg73m7jzvkq6irmf4

Fault Diagnosis and Reconfiguration Method for Network-on-Chip Based Multiple Processor Systems with Restricted Private Memories

Masashi IMAI, Tomohiro YONEDA
2013 IEICE transactions on information and systems  
If a fault is detected by mismatches, the fault is identified and isolated using a TMR (Triple Module Redundancy) and the system is reconfigured by the redundant processor cores.  ...  We propose a fault diagnosis and reconfiguration method based on the Pair and Swap scheme to improve the reliability and the MTTF (Mean Time To Failure) of network-on-chip based multiple processor systems  ...  This work is supported by CREST (Core Research for Evolutional Science and Technology) of JST (Japan Science and Technology Agency).  ... 
doi:10.1587/transinf.e96.d.1914 fatcat:b7ujcahfnva3fptedq2pp5nxai

Configurable isolation

Nidhi Aggarwal, Parthasarathy Ranganathan, Norman P. Jouppi, James E. Smith
2007 Proceedings of the 34th annual international symposium on Computer architecture - ISCA '07  
Chip multiprocessors with an abundance of identical resources like cores, cache and interconnection networks would appear to be ideal building blocks for implementing high availability solutions on chip  ...  We propose a new chip multiprocessor architecture that provides configurable isolation for fault containment and component retirement, based upon cost-effective modifications to commodity designs.  ...  Redundant processor hardware, employing dual (or triple) modular redundancy, DMR (TMR), can be applied at different granularities.  ... 
doi:10.1145/1250662.1250720 dblp:conf/isca/AggarwalRJS07 fatcat:qjus5bljgbflvabp43blc6tx7a

Configurable isolation

Nidhi Aggarwal, Parthasarathy Ranganathan, Norman P. Jouppi, James E. Smith
2007 SIGARCH Computer Architecture News  
Chip multiprocessors with an abundance of identical resources like cores, cache and interconnection networks would appear to be ideal building blocks for implementing high availability solutions on chip  ...  We propose a new chip multiprocessor architecture that provides configurable isolation for fault containment and component retirement, based upon cost-effective modifications to commodity designs.  ...  Redundant processor hardware, employing dual (or triple) modular redundancy, DMR (TMR), can be applied at different granularities.  ... 
doi:10.1145/1273440.1250720 fatcat:zgqn5nwiyffebbsgpmpb4hvbr4

Multi Processor Micro-Controllers for Automotive Safety-Critical Applications

Massimo Baleani, Leonardo Mangeruca, Maurizio Peri, Saverio Pezzini
2004 IFAC Proceedings Volumes  
Fault-tolerant electronic sub-systems are becoming a standard requirement as electronics becomes more and more pervasive in present cars.  ...  In this paper we present multi-processor micro-controller architectures devised within the Platform-Based Design framework to address fault-tolerant automotive applications.  ...  We would like to thank Alberto Ferrari for his contributions to platform-based design methodology and for his pivotal role in the conception and design of the multi-processor platforms.  ... 
doi:10.1016/s1474-6670(17)30319-1 fatcat:iux4p7gt5babdo3yqx6wfotqaq

A Survey of Fault-Tolerance Techniques for Embedded Systems from the Perspective of Power, Energy, and Thermal Issues

Sepideh Safari, Mohsen Ansari, Heba Khdr, Pourya Gohari Nazari, Sina Yari-Karin, Amir Yeganeh-Khaksar, Shaahin Hessabi, Alireza Ejlali, Jorg Henkel
2022 IEEE Access  
Specifically, fault-tolerance techniques employ some kind of redundancies to satisfy specific reliability requirements.  ...  High temperature, in turn, accelerates transistor aging mechanisms, which may ultimately lead to permanent faults on the chip.  ...  FAULT-TOLERANCE TECHNIQUES Faults in computer systems are classified into transient, intermittent, and permanent based on their occurrence and duration [16], [17]. • Transient faults: This type of  ... 
doi:10.1109/access.2022.3144217 fatcat:lktvlcw6szhw7osggwfyaqnn3a

Review Paper on Fault Tolerant Scheduling in Multicore System

2018 VFAST Transactions on Software Engineering  
In this paper, it was discussed about various fault tolerant task scheduling Algorithm for the multicore system based on hardware and software.  ...  Blend of triple module redundancy and double module redundancy considering Agricultural vulnerability factor other than EDF and LLF scheduling algorithms were used to create hardware-based algorithm.  ...  Bell et al have presented that faults are detected by using the redundant execution on a chip of multiprocessor with no impact on performance.  ... 
doi:10.21015/vtse.v13i2.509 fatcat:4nxrmvlkprdrncejeo2mfthzay

Heterogeneous Concurrent Error Detection (hCED) Based on Output Anticipation

Naveed Imran, Ronald F. Demara
2011 2011 International Conference on Reconfigurable Computing and FPGAs  
A conventional Concurrent Error Detection (CED) technique usually relies on two exact replicas of a given module to provide redundancy in fault-tolerant systems.  ...  In the paper, we discuss two forms of the heterogeneous structure which are spatial and temporal redundancy based.  ...  A better approach in terms of fault capacity is Triple Modular Redundancy (TMR) based design in which three instances of a module concurrently operate on the same input.  ... 
doi:10.1109/reconfig.2011.48 dblp:conf/reconfig/ImranD11 fatcat:mhlbv65x65ctbnpcvvpgnpqqh4

Fault and timing analysis in critical multi-core systems: A survey with an avionics perspective

Andreas Löfwenmark, Simin Nadjm-Tehrani
2018 Journal of systems architecture  
determinism, and higher sensitivity to permanent and transient faults due to shrinking transistor sizes.  ...  This paper reviews major contributions that assess the impact of fault tolerance on worst-case execution time of processes running on a multi-core platform.  ...  NFFP6-2013-01203 and NFFP7-2017-04890.  ... 
doi:10.1016/j.sysarc.2018.04.001 fatcat:74tk5j6kyjfmxpufn3x7dph6ve

A fault-tolerant architecture for parallel applications in tiled-CMPs

Daniel Sánchez, Juan L. Aragón, José M. García
2011 Journal of Supercomputing  
Previous proposals for providing fault detection and recovery have been mainly based on redundant execution over different cores.  ...  RMT (Redundant Multi-Threading) is a family of techniques based on SMT (Simultaneous Multi-Threading) processors in which two independent threads (master and slave), fed with the same inputs, redundantly  ...  This work has been jointly supported by the Spanish MEC and European Commission FEDER funds under grants "Consolider Ingenio-2010 CSD2006-00046" and "TIN2009-14475-C04-02".  ... 
doi:10.1007/s11227-011-0670-9 fatcat:edkmwysf2ray5a253g7quirtiy

Methods for fault tolerance in networks-on-chip

Martin Radetzki, Chaochao Feng, Xueqian Zhao, Axel Jantsch
2013 ACM Computing Surveys  
Research on fault-tolerant Networks-on-Chip tries to mitigate partial failure and its effect on network performance and reliability by exploiting various forms of redundancy at the suitable network layers  ...  Networks-on-Chip constitute the interconnection architecture of future, massively parallel multiprocessors that assemble hundreds to thousands of processing cores on a single chip.  ...  Next we move to the core of the survey by reviewing the techniques developed for detecting and tolerating faults in Networks-on-Chip.  ... 
doi:10.1145/2522968.2522976 fatcat:3t4b3rhbgbc2bphjevpkzlpm6u

A Survey of fault models and fault tolerance methods for 2D bus-based multi-core systems and TSV based 3D NOC many-core systems [article]

Shashikiran Venkatesha, Ranjani Parthasarathi
2022 arXiv pre-print
The Through silicon via based 3D Network on chip is the prospective solution for integrating many cores on single die.  ...  The article presents an elaborate discussion on fault models, failure mechanisms, resilient 3D routers, defect tolerance methods for the TSV based 3D NOC many-core systems.  ...  Triple Modular Redundancy (TMR) and Dual Modular Redundancy (DMR) methods are like special case of classical n-modular redundancy systems.  ... 
arXiv:2203.07830v1 fatcat:dsbx3o4v3femhi5d6kfrurzuoi

Towards scalable reliability frameworks for error prone CMPs

Joseph Sloan, Rakesh Kumar
2009 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems - CASES '09  
The in-network fault tolerance router utilizes the expected redundancy in vote messages, to reduce some of the blocking overhead incurred at the leader, and also provide a mechanism to trade-off network  ...  This makes voting latency and bandwidth significant performance bottlenecks for such systems. In this paper, we present a scalable NMR framework for error prone chip multiprocessors (CMPs).  ...  While core-level dual-modular redundancy (DMR) and triple-modular redundancy (TMR) have been shown to be effective when errors are rare, a large amount of core-level redundancy will be required for attaining  ... 
doi:10.1145/1629395.1629432 dblp:conf/cases/SloanK09 fatcat:7fvrqzyotvflplkjhjcafm5dyy