
A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch

Jaewoong Sim, Gabriel H. Loh, Hyesoon Kim, Mike O'Connor, Mithuna Thottethodi
2012 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture  
By keeping the majority of the DRAM cache clean, most HMP predictions do not need to be verified, and the self-balancing dispatch has more opportunities to redistribute requests (i.e., only requests to  ...  The second is a Self-Balancing Dispatch (SBD) mechanism that dynamically sends some requests to the off-chip memory even though the request may have hit in the die-stacked DRAM cache.  ...  Acknowledgments We would like to thank Andreas Moshovos, the Georgia Tech HPArch members, and the anonymous reviewers for their suggestions and feedback.  ... 
doi:10.1109/micro.2012.31 dblp:conf/micro/SimLKOT12 fatcat:gcmnluolizda7hiwrho2j4pek4

The RAMCloud Storage System

John Ousterhout, Mendel Rosenblum, Stephen Rumble, Ryan Stutsman, Stephen Yang, Arjun Gopalan, Ashish Gupta, Ankita Kejriwal, Collin Lee, Behnam Montazeri, Diego Ongaro, Seo Jin Park (+1 others)
2015 ACM Transactions on Computer Systems  
In many cases, DRAM is used as a cache for some other storage system, such as a database; this approach forces developers to manage consistency between the cache and the backing store, and its performance  ...  RAMCloud uses a unique two-level approach to log cleaning, which maximizes DRAM space utilization while minimizing I/O bandwidth requirements for secondary storage. Latency.  ...  The latency for memcached requests was around 300μs, and the overall hit rate for data in memcached was about 96.5%.  ... 
doi:10.1145/2806887 fatcat:fg3r5yahbjhxhcor6m2w2q6bxy

Message from the Program Co-chairs

2010 2010 19th IEEE Asian Test Symposium  
hardware • Performance issues with modern on-die messaging facilities and caching infrastructures This proceedings include 14 papers from 5 symposium sessions.  ...  In continuation of a successful series of events, the 4th symposium of the Many-core Applications Research Community (MARC) took place at the Hasso Plattner Institute for Software Systems Engineering (  ...  and implementing the software-based L2 cache flushing.  ... 
doi:10.1109/ats.2010.6 fatcat:iadg4ce5pbernnmwnjzkj4tnba

Message from the Program Co-Chairs

2013 2013 International Conference on Computer and Robot Vision  
hardware • Performance issues with modern on-die messaging facilities and caching infrastructures This proceedings include 14 papers from 5 symposium sessions.  ...  In continuation of a successful series of events, the 4th symposium of the Many-core Applications Research Community (MARC) took place at the Hasso Plattner Institute for Software Systems Engineering (  ...  and implementing the software-based L2 cache flushing.  ... 
doi:10.1109/crv.2013.5 fatcat:qahsqru4wbdsjeksn742a4fyiy

Dependable embedded systems

2008 2008 6th IEEE International Conference on Industrial Informatics  
Titles in the Series cover a focused set of embedded topics relating to traditional computing devices as well as high-tech appliances used in newer, personal devices, and related topics.  ...  This Series addresses current and future challenges pertaining to embedded hardware, software, specifications and techniques.  ...  Hideharu Amano at Keio University and its partnering institutions. It was a tremendous help to see the possibilities of FDSOI in silicon very early on.  ... 
doi:10.1109/indin.2008.4618103 fatcat:hal6brsgsjg5rlo3u5xil46pxi

Aegis: A single-chip secure processor

G. Suh, Charles O'Donnell, Srinivas Devadas
2007 IEEE Design & Test of Computers  
AEGIS, with its off-chip protection mechanisms, is slower than traditional processors by 26% on average for large applications and by a few percent for embedded applications.  ...  Physical random functions provide a cheap and secure way of generating a unique secret key on each processor, which enables a remote party to authenticate the processor chip.  ...  A clean solution for all the problems related to the virtual-to-physical address mapping is to use virtually-addressed caches.  ... 
doi:10.1109/mdt.2007.4343587 fatcat:qzwlnqrklvat5kgjzia7yed47q

Variability Mitigation in Nanometer CMOS Integrated Systems: A Survey of Techniques From Circuits to Software

Abbas Rahimi, Luca Benini, Rajesh K. Gupta
2016 Proceedings of the IEEE  
This article surveys challenges and opportunities in identifying variations, their effects and methods to combat these variations for improved microelectronic devices.  ...  We provide a comparative evaluation of methods for deployment across various layers of the system from circuits, architecture, to application software.  ...  Self-Timed Circuits: In self-timed circuit, or asynchronous circuit, there is no need for a clock signal to determine a starting time for a computation.  ... 
doi:10.1109/jproc.2016.2518864 fatcat:sxrsu3excbdg5p7sk4iczz262y

Graphics processing unit (GPU) programming strategies and trends in GPU computing

André R. Brodtkorb, Trond R. Hagen, Martin L. Sætra
2013 Journal of Parallel and Distributed Computing  
Explicit finite volume methods typically rely on stencil computations, making them inherently parallel, and therefore a near perfect match for the many-core graphics processing unit (GPU) found on today's  ...  There are several numerical methods for approximating the solution of hyperbolic PDEs like the shallow water equations, and finite volume methods constitute an important class.  ...  Partly accomplished at the National Center for Computational Hydroscience and Engineering, this research was also funded by the Department of Homeland Security-sponsored Southeast Region Research Initiative  ... 
doi:10.1016/j.jpdc.2012.04.003 fatcat:7s4fnkx3yrekbmxabztmto5fzq

Efficient synchronization mechanisms for scalable GPU architectures

Xiaowei Ren
2020
Third, we design HMG, a hierarchical cache coherence protocol for multi-GPU systems.  ...  The Graphics Processing Unit (GPU) has become a mainstream computing platform for a wide range of applications.  ...  Remote Loads: When a remote load arrives at the local home L2 cache, it either hits in the cache and returns data to the requester, or it misses and forwards the request to DRAM.  ... 
doi:10.14288/1.0394805 fatcat:aoyfyhwdyjbefp6yjyqk2p4tti

GPU computing architecture for irregular parallelism

Wilson Wai Lun Fung
2015
Second, Kilo TM is a cost effective, energy efficient solution for supporting transactional memory (TM) on GPUs.  ...  However, employing GPUs for applications with irregular parallelism tends to be a risky process, involving significant effort from the programmer and an uncertain amount of performance/efficiency benefit  ...  Each memory partition contains an off-chip DRAM channel, and one or more L2 cache banks that cache data from the off-chip DRAM.  ... 
doi:10.14288/1.0167110 fatcat:lk6u3fzl5fgcpnt2finxbdqfre

Use of shared memory in the context of embedded multi-core processors: exploration of the technology and its limits

Paolo Burgio, Luca Benini
2013
To tackle them, a template for a generic hardware processing unit (HWPU) is proposed, which shares the memory banks with cores, and the template for a scalable architecture is shown, which integrates them  ...  In a first part, the cost of algorithms for synchronization and data partitioning is analyzed, and they are adapted to modern embedded many-cores.  ...  Figure also shows the single processor structure, which is quite simple (e.g., has neither caches nor private memory, no branch speculation, etc.) for keeping the energy and area budgets low.  ... 
doi:10.6092/unibo/amsdottorato/6187 fatcat:ioz3w2s6ifa7tbtgnwadi2phxa

Repurposing Software Defenses with Specialized Hardware

Kanad Sinha
2019
balance of the above qualities.  ...  As a result, the community has started looking elsewhere for continued protection, as attacks continue to become progressively more sophisticated.  ...  Many are to blame for getting me past the finish line. Needless to say, the biggest credit goes to my advisor, Simha. Beyond the wealth of know-  ... 
doi:10.7916/d8-e6tc-kr63 fatcat:5mmez4ypdzfqffukip6xzaotve

Dundalk 1900–1960: An Oral History

Charles Flynn
2001 Irish Economic and Social History  
There was a benevolent parental attitude and they had playing fields and tennis courts for the workers and they encouraged drama.  ...  He'd come in drunk and they'd start a row and she'd lift the [knife] and she'd say hit me hit me. And he'd be that drunk he would hit her. Dr.  ...  You couldn't just open a pawn broking' business, you had to apply to the courts for a... for a. . a licence (Charlie prompts 'licence').  ... 
doi:10.1177/033248930102800109 fatcat:vtw4cuj2pvfc3knfiwlcr5ze5e

Flash-aware Database Management Systems

Sergej Hardock
2020
As a result, the effective I/O throughput and longevity expectations of SSDs are significantly lower than those of Flash memory encapsulated in these SSDs.  ...  Flash SSDs are becoming the primary storage technology for single servers and large data centers.  ...  cache hits.  ... 
doi:10.25534/tuprints-00014476 fatcat:bhbbyleitnb6vhywjbf7khdp7q

A cycle-accurate joulemeter for CMOS VLSI circuits

Eunseok Song, In-Chan Choi, Young-Kil Park, Soo-Ik Chae
ESSCIRC 2004 - 29th European Solid-State Circuits Conference (IEEE Cat. No.03EX705)  
As a driver application, we consider wide-area computing substrates for ambient intelligent systems which provide an unexplored hardware platform for executing distributed applications under strict energy  ...  Power dissipation has become a critical design concern in recent years, driven by the increased levels of complexity and emergence of mobile applications.  ...  Jane Liu for her valuable comments and suggestions for the extension of the proposed analysis.  ... 
doi:10.1109/esscirc.2003.1257211 fatcat:2tminukn2jfhjezaakh7plck7y