
Three-dimensional Integrated Circuits: Design, EDA, and Architecture

Guangyu Sun
2011 Foundations and Trends® in Electronic Design Automation  
...  which could be further categorized as wafer-to-wafer, die-to-wafer, or die-to-die stacking methods.  ...  The emerging three-dimensional (3D) integration technology is one of the promising solutions to overcome the barriers in interconnect scaling, thereby offering an opportunity to continue performance improvements  ...  The most straightforward way is to implement a vertical bus across the layers of stacked DRAM layers to the processor cores [93] [94] [95].  ... 
doi:10.1561/1000000016 fatcat:usmthkco4rfavmnlvvmmgxolcq

Enabling interposer-based disintegration of multi-core processors

Ajaykumar Kannan, Natalie Enright Jerger, Gabriel H. Loh
2015 Proceedings of the 48th International Symposium on Microarchitecture - MICRO-48  
We consider how the routing resources afforded by the interposer can be used to improve the network-on-chip's (NoC) capabilities and use the interposer to "disintegrate" a multi-core chip into smaller  ...  Silicon interposers enable high-performance processors to integrate a significant amount of in-package memory, thereby providing huge bandwidth gains while reducing the costs of accessing memory.  ...  I have learned a lot from her expertise in computer  ... 
doi:10.1145/2830772.2830808 dblp:conf/micro/KannanJL15 fatcat:o7jke4oiurgknanh7mngkv6hdm

Wafer-level 3D integration technology

S. J. Koester, A. M. Young, R. R. Yu, S. Purushothaman, K.-N. Chen, D. C. La Tulipe, N. Rana, L. Shi, M. R. Wordeman, E. J. Sprogis
2008 IBM Journal of Research and Development  
The basic reasoning for pursuing 3D integration is presented, followed by a description of the possible process variations and integration schemes, as well as the process technology elements needed to  ...  Detailed descriptions of two wafer-level integration schemes implemented at IBM are given, and the challenges of bringing 3D integration into a production environment are discussed.  ...  Acknowledgments We thank the following people for their contributions to this work: Steven Steen  ... 
doi:10.1147/jrd.2008.5388565 fatcat:tqj6r7zsmjabthub4l6qxigy54

A Retrospective and Futurespective of Rowhammer Attacks and Defenses on DRAM [article]

Zhi Zhang, Jiahao Qi, Yueqiang Cheng, Shijie Jiang, Yiyang Lin, Yansong Gao, Surya Nepal, Yi Zou, Jiliang Zhang, Yang Xiang
2022 arXiv   pre-print
In particular, most industrial solutions have turned out to be ineffective against rowhammer while on-die ECC's susceptibility to rowhammer calls for a comprehensive study.  ...  In this paper, we systematize rowhammer attacks and defenses with a focus on DRAM.  ...  Dynamic Skewed Hash Tree: Vig et al. [70] propose a lightweight scheme within the MC to check data integrity.  ... 
arXiv:2201.02986v2 fatcat:72hvl7xgsrerpndekuyoq5mg2a

Addressing Variability in Reuse Prediction for Last-Level Caches [article]

Priyank Faldu
2020 arXiv   pre-print
Last-Level Cache (LLC) represents the bulk of a modern CPU processor's transistor budget and is essential for application performance as LLC enables fast access to data in contrast to much slower main  ...  To that end, we propose two cache management techniques, one domain-agnostic and one domain-specialized, to improve cache efficiency by addressing variability in reuse prediction.  ...  Data-dependent irregular access patterns, combined with the sparse distribution of hot vertices, make it difficult for existing domain-agnostic predictive techniques in reliably identifying, and, in turn  ... 
arXiv:2006.08487v1 fatcat:4stqscurbbcb3am5lw33c4pksm

Memory leads the way to better computing

H.-S. Philip Wong, Sayeef Salahuddin
2015 Nature Nanotechnology  
The goal of the study was to assay the state of the art, and not to either propose a potential system or prepare and propose a detailed roadmap for its development.  ...  I am honored to have been part of this study, and wish to thank the study members for their passion for the subject, and for contributing far more of their precious time than they expected. Peter M.  ...  Only recently have stacked die-level DRAM devices been in high-volume production.  ... 
doi:10.1038/nnano.2015.29 pmid:25740127 fatcat:d6iiuuwcozbxlgn4kxxzdzwd4m

A 1920 × 1080 30-frames/s 2.3 TOPS/W Stereo-Depth Processor for Energy-Efficient Autonomous Navigation of Micro Aerial Vehicles

Ziyun Li, Qing Dong, Mehdi Saligane, Benjamin Kempke, Luyao Gong, Zhengya Zhang, Ronald Dreslinski, Dennis Sylvester, David Blaauw, Hun-Seok Kim
2018 IEEE Journal of Solid-State Circuits  
A dependence-resolving scan associated with 16-stage deep pipeline is introduced to hide the data dependence between neighboring pixels in the SGM algorithm.  ...  We exploit inherent data parallelism in the algorithm by processing 128 local disparity costs and aggregating the SGM costs along four paths for all 128 disparities in parallel.  ...  ACKNOWLEDGMENT The authors would like to thank TSMC University Shuttle Program for chip fabrication.  ... 
doi:10.1109/jssc.2017.2751501 fatcat:t7fxuqfzcrf3npde6gzmkpnc2e

Architectural Techniques for Improving NAND Flash Memory Reliability [article]

Yixin Luo
2018 arXiv   pre-print
and write-cold data. (2) We propose a new framework that learns an online flash channel model for each chip and enables four new flash controller algorithms to improve flash reliability by up to 69.9%  ...  We aim to improve flash reliability with a multitude of low-cost architectural techniques.  ...  cell design, and vertically stacks dozens of silicon layers in a single chip.  ... 
arXiv:1808.04016v1 fatcat:fotned4yajc2xmaoezwjdrgypu

An Inductive-Coupling Inter-Chip Link for High-Performance and Low-Power 3D System Integration [chapter]

Kiichi Niitsu, Tadahiro Kuro
2010 Solid State Circuits Technologies  
Acknowledgements This work has been in part supported by the Grant-in-Aid for JSPS fellows and the Central Research Laboratory of Hitachi Limited.  ...  The power efficiency is improved by narrowing a transmission data pulse to 180ps.  ...  Besides, in order to achieve low-power operation, synchronous scheme is utilized in an inductive-coupling link.  ... 
doi:10.5772/6885 fatcat:v3qdnxph5fap7mi4mslout7m6y

2020 Index IEEE Transactions on Circuits and Systems II: Express Briefs Vol. 67

2020 IEEE Transactions on Circuits and Systems - II - Express Briefs  
Computations in 1Transistor-1Resistor Memory Arrays; TCSII Dec. 2020 3347-3351 Jalali, M., see Kabirpour, S., TCSII Feb. 2020 250-254 Jalali, M., see Kari Dolatabadi, A., TCSII Oct. 2020 1740-1744  ...  ., Robust Output Regulation in Discrete-Time Singular Systems With Actuator Saturation and Uncertainties; 340-344 Jagabar Sathik, M., Sandeep, N., Almakhles, D., and Blaabjerg, F., Cross Connected Compact  ...  ., +, TCSII July 2020 1184-1188 A 0.83-pJ/Bit 6.4-Gb/s HBM Base Die Receiver Using a 45° Strobe Phase for Energy-Efficient Skew Compensation.  ... 
doi:10.1109/tcsii.2020.3047305 fatcat:ifjzekeyczfrbp5b7wrzandm7e

Enhancing Programmability, Portability, and Performance with Rich Cross-Layer Abstractions [article]

Nandita Vijaykumar
2019 arXiv   pre-print
While there is abundant research, and thus significant improvements, at different levels of the stack that address these very challenges, in this thesis, we observe that we are fundamentally limited by  ...  We propose 4 different approaches to designing richer abstractions between the application, system software, and hardware architecture in different contexts to significantly improve programmability, portability  ...  The rows reserved for duplicate data can also be used to map weak cells in DRAM to reduce refresh overheads and improve reliability.  ... 
arXiv:1911.05660v1 fatcat:w5f3g4isqbcphm2jjfzjtvrjnq

High Performance Datacenter Networks: Architectures, Algorithms, and Opportunities

Dennis Abts, John Kim
2011 Synthesis Lectures on Computer Architecture  
Acknowledgments First we would like to thank Mark Hill and Michael Morgan for having invited us to write a synthesis lecture and for their support. Many thanks to reviews from Tor M. Aamodt  ...  They may, for instance, restructure control behaviors of their kernel to improve SIMD utilization, reduce the number of bank conflicts to shared memory by skewing static memory allocations, or modifying  ...  In the stack-based reconvergence system, the streaming multiprocessor uses a hardware stack structure to record the join location and the next fetch address.  ... 
doi:10.2200/s00341ed1v01y201103cac014 fatcat:rjpziqdnezdrnhfiygrg3jdz4m

Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU)

Hyesoon Kim, Richard Vuduc, Sara Baghsorkhi, Jee Choi, Wen-mei Hwu
2012 Synthesis Lectures on Computer Architecture  
Acknowledgments First we would like to thank Mark Hill and Michael Morgan for having invited us to write a synthesis lecture and for their support. Many thanks to reviews from Tor M. Aamodt  ...  They may, for instance, restructure control behaviors of their kernel to improve SIMD utilization, reduce the number of bank conflicts to shared memory by skewing static memory allocations, or modifying  ...  In the stack-based reconvergence system, the streaming multiprocessor uses a hardware stack structure to record the join location and the next fetch address.  ... 
doi:10.2200/s00451ed1v01y201209cac020 fatcat:ll4uas6lmjbcll5zqzomhcv5vq

Algorithms and Data Structures [chapter]

2009 Practical Guide to Computer Simulations  
In this dissertation we propose algorithms and data structures that are efficient and robust with respect to different hardware factors.  ...  In a streaming setting, algorithms are restricted to access data sequentially, while having at their disposal a working memory that can be accessed for free, but which is usually much smaller than the  ...  Unfortunately, these improvements come at the cost of reliability [113, 114] .  ... 
doi:10.1142/9789812836632_0004 fatcat:5w2ix5xm35he7hv4oliyxcckzm

Algorithms and Data Structures [chapter]

2008 Robustness and Usability in Modern Design Flows  
In this dissertation we propose algorithms and data structures that are efficient and robust with respect to different hardware factors.  ...  In a streaming setting, algorithms are restricted to access data sequentially, while having at their disposal a working memory that can be accessed for free, but which is usually much smaller than the  ...  Unfortunately, these improvements come at the cost of reliability [113, 114] .  ... 
doi:10.1007/978-1-4020-6536-1_3 fatcat:vksli6i5wjgfbfrg3ifeqxlkeu