ARI

Viacheslav V. Fedorov, Sheng Qiu, A. L. Narasimha Reddy, Paul V. Gratz
2013 ACM Transactions on Architecture and Code Optimization (TACO)  
In a typical system, this boosts IPC by 4.9% on an average while decreasing energy consumption by 8.9%. These results are achieved with minimal hardware overheads.  ...  memory technology, such as Phase-Change Memory (PCM).  ...  ACKNOWLEDGMENTS The authors would like to thank Onur Mutlu and the anonymous reviewers for their insightful comments which helped to improve the quality of this manuscript.  ... 
doi:10.1145/2543697 fatcat:2qy5fp4rt5ex5pq2y47kiyj5py

A survey of power management techniques for phase change memory

Sparsh Mittal
2016 International Journal of Computer Aided Engineering and Technology  
Recently, several architecture and system-level techniques have been proposed to address this issue. In this paper, we survey several techniques for managing power consumption of PCM.  ...  The aim of this paper is to provide insights to researchers into working of PCM power-management techniques and also motivate them to propose even better techniques for designing future 'green' PCM-based  ...  [35] propose a writeback-aware bandwidth partitioning scheme for hybrid DRAM-PCM main memory system. Due to the slow writes to PCM, the bandwidth to PCM acts as a bottleneck.  ... 
doi:10.1504/ijcaet.2016.10000092 fatcat:gnowq3m4jvfuhbqx2jhc27y27m

Write-Aware Replacement Policies for PCM-Based Systems

R. Rodríguez-Rodríguez, F. Castro, D. Chaver, R. Gonzalez-Alberquilla, L. Piñuel, F. Tirado
2014 Computer journal  
In this paper we target general purpose processors provided with this kind of non-volatile main memory and we exhaustively evaluate our proposed policies in both single and multi-core environments.  ...  the last level cache (LLC) with the goal of cutting the write traffic to memory and consequently to increase PCM lifetime without degrading system performance.  ...  Figure 9 illustrates the experimental system used in both single-core and multi-core scenarios.  ... 
doi:10.1093/comjnl/bxu104 fatcat:raaxiid7ivfhbh33ll6a5pdd2q

Morphable DRAM Cache Design for Hybrid Memory Systems

Sanghoon Cha, Bokyeong Kim, Chang Hyun Park, Jaehyuk Huh
2019 ACM Transactions on Architecture and Code Optimization (TACO)  
When a small amount of fast memory is combined with slow but large memory, the cache-based organization of the fast memory can provide a SW-transparent solution for the hybrid memory systems.  ...  In such DRAM cache designs, their effectiveness is affected by the bandwidth and latency of both fast and slow memory.  ...  We implement a memory bandwidth partitioning mechanism similar to the fair queuing memory system (Nesbit et al. 2006).  ... 
doi:10.1145/3338505 fatcat:wo5fe2fqivcyragcf37rskzqom

Improving cache performance using read-write partitioning

Samira Khan, Alaa R. Alameldeen, Chris Wilkerson, Onur Mutlu, Daniel A. Jimenez
2014 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)  
For a single-core system, RWP provides 5% average speedup across the entire SPEC CPU2006 suite, and 14% average speedup for cache-sensitive benchmarks, over the baseline LRU replacement policy.  ...  With few exceptions, cache lines that serve loads are more critical for performance than cache lines that serve only stores.  ...  Acknowledgement We are grateful to Aamer Jaleel who helped with CMP$im [5].  ... 
doi:10.1109/hpca.2014.6835954 dblp:conf/hpca/KhanAWMJ14 fatcat:rbonj4nuybbjvgejy5jbn5ntsi

Preserving Row Buffer Locality for PCM Wear-Leveling under Massive Parallelism

Xinning Wang, Bin Wang, Zhuo Liu, Weikuan Yu
2015 2015 IEEE 23rd International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems  
It has been exploited to work in concert or alone inside various memory systems to meet the growing bandwidth needs of massive parallelism.  ...  Phase Change Memory (PCM) is a promising alternative technology for DRAM because of its advantages in terms of transistor density and energy consumption.  ...  a PCM-based GPU global memory sub-system.  ... 
doi:10.1109/mascots.2015.39 dblp:conf/mascots/WangWLY15 fatcat:soqwokfeovhodeqqxqbsz4scfu

Research Problems and Opportunities in Memory Systems

2014 Supercomputing Frontiers and Innovations  
systems), 3) providing predictable performance and QoS to applications sharing the memory system (i.e., QoS-aware memory systems).  ...  Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck.  ...  We would like to thank Rachata Ausavarungnirun for logistic help in preparing this article and earlier versions of it.  ... 
doi:10.14529/jsfi140302 fatcat:2zfa7zk3qjgohdsgxmkkqaamuu

Scalable logging through emerging non-volatile memory

Tianzheng Wang, Ryan Johnson
2014 Proceedings of the VLDB Endowment  
Distributed logging, a once prohibitive technique for single node systems in the DRAM era, becomes a promising solution to easing the logging bottleneck because of the nonvolatility and high performance  ...  It potentially invalidates the need for flush-before-commit as log records are persistent immediately upon write.  ...  However, without NUMA-aware memory allocation, page-level partitioning cannot scale as core count reaches the system limit due to the NUMA effect.  ... 
doi:10.14778/2732951.2732960 fatcat:l4aaimqbbnbtjdnjus2cvtcv2y

Main Memory Scaling: Challenges and Solution Directions [chapter]

Onur Mutlu
2015 More than Moore Technologies for Next Generation Computer Design  
More and increasingly heterogeneous processing cores and agents/clients are sharing the memory system [6, 21, 107, 45, 46, 36, 23], leading to increasing demand for memory capacity and bandwidth along  ...  predictable performance and QoS to applications sharing the memory system (i.e., QoS-aware memory systems).  ...  Part of the structure of this chapter is based on an evolving set of talks I have delivered at various venues on Scaling the Memory System in the Many-Core Era and Rethinking Memory System Design for Data-Intensive  ... 
doi:10.1007/978-1-4939-2163-8_6 fatcat:okw4kxakuja43kac65zy5c35ye

Reducing the cost of persistence for nonvolatile heaps in end user devices

Sudarsun Kannan, Ada Gavrilovska, Karsten Schwan
2014 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)  
This paper explores the performance implications of using future byte addressable non-volatile memory (NVM) like PCM in end client devices.  ...  With these solutions, experimental evaluations with different end user applications and SPEC2006 benchmarks show up to 12% reductions in cache misses, thereby reducing the total number of NVM writes.  ...  Acknowledgments This research is supported in part by the Intel URO program on software for persistent memories and by NSF award CCF-1161969.  ... 
doi:10.1109/hpca.2014.6835960 dblp:conf/hpca/KannanGS14 fatcat:45daodexbjbhzhhy5zvdoks5cm

A Survey of Non-Volatile Main Memory Technologies: State-of-the-Arts, Practices, and Future Directions [article]

Haikun Liu, Di Chen, Hai Jin, Xiaofei Liao, Bingsheng He, Kan Hu, Yu Zhang
2020 arXiv   pre-print
Non-Volatile Main Memories (NVMMs) have recently emerged as promising technologies for future memory systems.  ...  They bring many research opportunities as well as challenges on system architectural designs, memory management in operating systems (OSes), and programming models for hybrid memory systems.  ...  WADE: Writeback-aware dynamic cache management for NVM-based main memory system.  ... 
arXiv:2010.04406v1 fatcat:jna5pb7lizhvllmhfle4yikife

Banshee

Xiangyao Yu, Christopher J. Hughes, Nadathur Satish, Onur Mutlu, Srinivas Devadas
2017 Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-50 '17  
Second, it reduces unnecessary DRAM cache replacement traffic with a new bandwidth-aware frequency-based replacement policy.  ...  Hence, as we show in this paper, these designs are suboptimal for use with in-package DRAM.  ...  This work is supported in part by the Intel Science and Technology Center (ISTC) for Big Data.  ... 
doi:10.1145/3123939.3124555 dblp:conf/micro/YuHSMD17 fatcat:ttxzmszwynhtnf5rb3xplfct6y

Write-rationing garbage collection for hybrid memories

Shoaib Akram, Jennifer B. Sartor, Kathryn S. McKinley, Lieven Eeckhout
2018 Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation - PLDI 2018  
We implement two such systems. (1) Kingsguard-nursery places new allocation in DRAM and survivors in NVM, reducing NVM writes by 5× versus NVM only with wear-leveling. (2) Kingsguard-writers (KG-W) places  ...  This work opens up new avenues for making hybrid memories practical.  ...  To extend our simulation results to write rates for a 32-core machine, we first obtain write rates for the 4-core system in Table 3.  ... 
doi:10.1145/3192366.3192392 dblp:conf/pldi/AkramSME18 fatcat:uefp7eapgjbxzf7rmhhtmg7rhu

Banshee: Bandwidth-Efficient DRAM Caching Via Software/Hardware Cooperation [article]

Xiangyao Yu, Christopher J. Hughes, Nadathur Satish, Onur Mutlu, Srinivas Devadas
2017 arXiv   pre-print
Hence, as we show in this paper, these designs are suboptimal for use with in-package DRAM.  ...  The key ideas are to eliminate the in-package DRAM bandwidth overheads due to costly tag accesses through virtual memory mechanism and to incorporate a bandwidth-aware frequency-based replacement policy  ...  These include designs for hybrid DRAM and Phase Change Memory (PCM) [36, 37, 38] and a single DRAM chip with fast and slow portions [39, 40] .  ... 
arXiv:1704.02677v1 fatcat:uljhy5xcnvdrjnfgi5fj74cgwe

The Processing Using Memory Paradigm: In-DRAM Bulk Copy, Initialization, Bitwise AND and OR [article]

Vivek Seshadri, Onur Mutlu
2016 arXiv   pre-print
This approach consumes high latency, bandwidth, and energy for operations that work on a large amount of data.  ...  In existing systems, the off-chip memory interface allows the memory controller to perform only read or write operations.  ...  Acknowledgments We thank the members of the SAFARI and LBA research groups, and the various anonymous reviewers for their valuable feedback on the multiple works described in this document.  ... 
arXiv:1610.09603v1 fatcat:g3snpudrbzdnlh5ixd54blunyq