BulletProof: A Defect-Tolerant CMP Switch Architecture

K. Constantinides, S. Plaza, J. Blome, Bin Zhang, V. Bertacco, S. Mahlke, T. Austin, M. Orshansky
The Twelfth International Symposium on High-Performance Computer Architecture, 2006.  
We select one small aspect of a typical chip multiprocessor (CMP) system to study in detail, a single CMP router switch.  ...  We find that designs are attainable that can tolerate a larger number of defects with less overhead than naïve triple-modular redundancy, using domain-specific techniques such as end-to-end error detection  ...  Further, a CMP switch is only a first step towards the over-reaching goal of designing a defect-tolerant CMP system.  ... 
doi:10.1109/hpca.2006.1598108 dblp:conf/hpca/ConstantinidesPBZBMAO06 fatcat:zoigoavtl5g55pjhras3t2etzq

Architecting a reliable CMP switch architecture

Kypros Constantinides, Stephen Plaza, Jason Blome, Valeria Bertacco, Scott Mahlke, Todd Austin, Bin Zhang, Michael Orshansky
2007 ACM Transactions on Architecture and Code Optimization (TACO)  
Our goal is to design a BulletProof CMP switch architecture capable of tolerating significant levels of various types of defects.  ...  We select one small aspect of a typical chip multiprocessor (CMP) system to study in detail, a single CMP router switch.  ...  ACKNOWLEDGMENTS We would like to acknowledge Li-Shiuan Peh for providing us access to CMP switch models, Doug Burger for providing CMP network reference traces, and the anonymous reviewers for providing  ... 
doi:10.1145/1216544.1216545 fatcat:2hzj7orxjnd7ldc6tsa2vc3mpa

Reliable Systems on Unreliable Fabrics

Todd Austin, Valeria Bertacco, Scott Mahlke, Yu Cao
2008 IEEE Design & Test of Computers  
Figure 5. BulletProof defect-tolerant pipeline: overview (a), system architecture (b), and measured results (c). Periodic online testing identifies defects at runtime.  ...  on a commercial CMP based on Sun's Niagara, the technique provided defect tolerance for 99.2% of the chip area with only a 5.8% area overhead.  ... 
doi:10.1109/mdt.2008.107 fatcat:ykmurvstufcvrevyigxcyp2qhu

StageNet: A Reconfigurable Fabric for Constructing Dependable CMPs

Shantanu Gupta, Shuguang Feng, Amin Ansari, Scott Mahlke
2011 IEEE Transactions on Computers  
Our results show that the proposed SN architecture can perform 40 percent more cumulative work compared to a traditional CMP over 12 years of its lifetime.  ...  To this end, this paper presents and evaluates a highly reconfigurable CMP architecture, named as StageNet (SN), that is designed with reliability as its first-class design criteria.  ...  ., the US National Science Foundation grant CCF-0347411, and the Gigascale Systems Research Center, one of five research centers funded under the Focus Center Research Program, a Semiconductor Research  ... 
doi:10.1109/tc.2010.205 fatcat:xs7oqbxxdnfb7nhze27oknj2jq

StageNetSlice

Shantanu Gupta, Shuguang Feng, Amin Ansari, Jason Blome, Scott Mahlke
2008 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems - CASES '08  
We use this study to motivate the design of StageNet, an embedded CMP architecture designed from its inception with reliability as a first class design constraint.  ...  A naive slice design results in approximately 4X slowdown versus a traditional processor due to longer communication delays in the pipeline.  ...  ., the National Science Foundation grant CCF-0347411, and the Gigascale Systems Research Center, one of five research centers funded under the Focus Center Research Program, a Semiconductor Research Corporation  ... 
doi:10.1145/1450095.1450099 dblp:conf/cases/GuptaFABM08 fatcat:mygmv363rnfb7atfe3ew544ddy

Savior: A Reliable Fault Resilient Router Architecture for Network-on-Chip

Ayaz Hussain, Muhammad Irfan, Naveed Khan Baloch, Umar Draz, Tariq Ali, Adam Glowacz, Larisa Dunai, Jose Antonino-Daviu
2020 Electronics  
The proposed router architecture achieves the highest Silicon Protection Factor (SPF) metric, which is 24.8 as compared to the state-of-the-art fault-tolerant architectures.  ...  (VA) and multiple paths for switch allocator (SA) and crossbar (XB).  ...  In [15], the author proposes a defect-tolerant CMP switch architecture named BulletProof. They employed a generic model of the bathtub curve for permanent fault models.  ... 
doi:10.3390/electronics9111783 fatcat:4ycrkynwbfcm7dqtugvjiwiq7u

Necromancer

Amin Ansari, Shuguang Feng, Shantanu Gupta, Scott Mahlke
2010 Proceedings of the 37th annual international symposium on Computer architecture - ISCA '10  
Although a faulty core cannot be trusted to correctly execute programs, we observe in this work that for most defects, when starting from a valid architectural state, execution traces on a defective core  ...  This defect tolerance and throughput enhancement comes at modest area and power overheads of 5.3% and 8.5%, respectively.  ...  Prior work on defect tolerance mostly focused on on-chip caches since there is less homogeneity in the non-cache parts of a core, making defect tolerance a more challenging issue.  ... 
doi:10.1145/1815961.1816024 dblp:conf/isca/AnsariFGM10 fatcat:fayojr7wabga5chpvgzificza4

Necromancer

Amin Ansari, Shuguang Feng, Shantanu Gupta, Scott Mahlke
2010 SIGARCH Computer Architecture News  
Although a faulty core cannot be trusted to correctly execute programs, we observe in this work that for most defects, when starting from a valid architectural state, execution traces on a defective core  ...  This defect tolerance and throughput enhancement comes at modest area and power overheads of 5.3% and 8.5%, respectively.  ...  Prior work on defect tolerance mostly focused on on-chip caches since there is less homogeneity in the non-cache parts of a core, making defect tolerance a more challenging issue.  ... 
doi:10.1145/1816038.1816024 fatcat:n5ppvna6orhchf2pnilmkptdom

ReliNoC: A reliable network for priority-based on-chip communication

M R Kakoee, V Bertacco, L Benini
2011 Design, Automation & Test in Europe  
Our network leverages a dual physical channel switch architecture which removes the control overhead of virtual channels (VCs) and utilizes the inherent redundancy within the 2-channel switch to provide  ...  Synthesis results show that our reliable architecture incurs only 13% area overhead on the baseline 2-channel switch.  ...  In the ReliNoC switch, we take advantage of the redundant components in switches and use them as replacements in presence of faults. Our architecture can tolerate several faults in the network.  ... 
doi:10.1109/date.2011.5763112 dblp:conf/date/KakoeeBB11 fatcat:2sgnpkzgpfby7a6d6tu7betfje

Viper

Andrea Pellegrini, Joseph L. Greathouse, Valeria Bertacco
2012 SIGARCH Computer Architecture News  
This is done using distributed control logic, which avoids a single point of failure by construction. Viper can tolerate a high number of permanent faults due to its inherent redundancy.  ...  We estimate that fault rates higher than one permanent fault per 12 million transistors, on average, cause the throughput of a classic CMP design to fall below that of a Viper design of similar size.  ...  In addition, we acknowledge the support of the Gigascale Systems Research Center, one of five research centers funded under the Focus Center Research Program, a Semiconductor Research Corporation program  ... 
doi:10.1145/2366231.2337199 fatcat:j6xbgpfjdbcirlcq4bvhggxgni

Viper: Virtual pipelines for enhanced reliability

Andrea Pellegrini, Joseph L. Greathouse, Valeria Bertacco
2012 39th Annual International Symposium on Computer Architecture (ISCA)  
This is done using distributed control logic, which avoids a single point of failure by construction. Viper can tolerate a high number of permanent faults due to its inherent redundancy.  ...  We estimate that fault rates higher than one permanent fault per 12 million transistors, on average, cause the throughput of a classic CMP design to fall below that of a Viper design of similar size.  ...  In addition, we acknowledge the support of the Gigascale Systems Research Center, one of five research centers funded under the Focus Center Research Program, a Semiconductor Research Corporation program  ... 
doi:10.1109/isca.2012.6237030 dblp:conf/isca/PellegriniGB12 fatcat:qoqd4nh3ejfzfgurkzseeye3ly

The StageNet fabric for constructing resilient multicore systems

Shantanu Gupta, Shuguang Feng, Amin Ansari, Jason Blome, Scott Mahlke
2008 41st IEEE/ACM International Symposium on Microarchitecture  
Our results show that the proposed SN architecture can perform nearly 50% more cumulative work compared to a traditional multicore.  ...  To this end, this paper presents and evaluates a highly reconfigurable multicore architecture, named StageNet (SN), that is designed with reliability as its first class design criteria.  ...  ., the National Science Foundation grant CCF-0347411, and the Gigascale Systems Research Center, one of five research centers funded under the Focus Center Research Program, a Semiconductor Research Corporation  ... 
doi:10.1109/micro.2008.4771786 dblp:conf/micro/GuptaFABM08 fatcat:b4kfnszd6bce5lqwwlnuumfl7e

NoCGuard: A Reliable Network-on-Chip Router Architecture

Muhammad Akmal Shafique, Naveed Khan Baloch, Muhammad Iram Baig, Fawad Hussain, Yousaf Bin Zikria, Sung Won Kim
2020 Electronics  
To deal with these reliability challenges, this research proposed NoCGuard, a reconfigurable architecture designed to tolerate multiple permanent faults in each pipeline stage of the generic router.  ...  NoCGuard router architecture uses four highly reliable and low-cost fault-tolerant strategies.  ...  This led to the design of chip multi-processors (CMP) or multi-core architectures with high performance and low power consumption [2].  ... 
doi:10.3390/electronics9020342 fatcat:c4bqddrzlnefxmw7fa53nj6jba

Design of a Near-Ideal Fault-Tolerant Routing Algorithm for Network-on-Chip-Based Multicores [article]

Costas Iordanou, Vassos Soteriou, Konstantinos Aisopos
2020 arXiv pre-print
Aiming towards seamless NoC operation in the presence of faulty links we propose Hermes, a near-ideal fault-tolerant routing algorithm that meets the objectives of exhibiting high levels of robustness,  ...  operating in a distributed mode, guaranteeing freedom from deadlocks, and evening-out traffic, among many.  ...  Even an isolated intra-router defect or a sole link failure can morph a regular topology into an arbitrary one with an unanticipated geometry.  ... 
arXiv:2006.11025v1 fatcat:22xo5gml3ndjjj63bk3pnma6nq

A Holistic Solution for Reliability of 3D Parallel Systems [article]

Javad Bagherzadeh
2021
Moreover, due to the infancy of the manufacturing process, high variation, and defect densities, chip designers are not encouraged to consider these emerging technologies as a stand-alone replacement for  ...  By leveraging 3D fabric layouts, it proposes the underlying architecture to efficiently repair the system in the presence of faults.  ...  Recently, 64 parallel processor cores with stacked memory [23] and a large-scale 3D CMP with a cluster-based near-threshold computing architecture [24] have been demonstrated by academia.  ... 
doi:10.7302/1545 fatcat:fscjrcibkvcrnew6dsnqg7zmda