
Improving inclusive cache performance with two-level eviction priority

Lingda Li, Dong Tong, Zichao Xie, Junlin Lu, Xu Cheng
2012 2012 IEEE 30th International Conference on Computer Design (ICCD)  
Unfortunately, they are inapplicable in inclusive LLCs. In this paper, we propose the Two-level Eviction Priority (TEP) policy.  ...  Many recent intelligent management policies have been proposed to improve last-level cache (LLC) performance by evicting blocks with poor locality earlier.  ...  We propose the Two-level Eviction Priority (TEP) policy for inclusive cache replacement to achieve the two goals above.  ...
doi:10.1109/iccd.2012.6378668 dblp:conf/iccd/LiTXLC12 fatcat:og6gjd7vyzf5zndpyew6udeole
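
The abstract above only hints at the mechanism, so here is a minimal, hypothetical sketch of what a two-level eviction priority could look like in an inclusive LLC: victims are chosen first by a coarse priority class (for example, whether a block is predicted to have poor locality and is no longer resident in the core caches, since evicting an LLC block in an inclusive hierarchy forces back-invalidations), and only then by recency. The fields and the classification rule are illustrative assumptions, not the TEP policy itself.

```python
# Illustrative two-level victim selection for an inclusive LLC (not the paper's TEP).

class Block:
    def __init__(self, tag, in_core_caches=False, poor_locality=False, lru_age=0):
        self.tag = tag
        self.in_core_caches = in_core_caches  # hypothetical hint from the inner levels
        self.poor_locality = poor_locality    # hypothetical locality prediction bit
        self.lru_age = lru_age                # larger = older

def pick_victim(cache_set):
    """Level 1: prefer blocks predicted to have poor locality and not resident in
    the core caches (avoids back-invalidations).  Level 2: among equals, evict the
    least recently used block."""
    def key(b):
        level1 = (0 if (b.poor_locality and not b.in_core_caches) else
                  1 if not b.in_core_caches else 2)
        return (level1, -b.lru_age)
    return min(cache_set, key=key)

if __name__ == "__main__":
    ways = [Block("A", in_core_caches=True, lru_age=9),
            Block("B", poor_locality=True, lru_age=2),
            Block("C", lru_age=7)]
    print(pick_victim(ways).tag)  # -> "B"
```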

On the theory and potential of LRU-MRU collaborative cache management

Xiaoming Gu, Chen Ding
2011 Proceedings of the international symposium on Memory management - ISMM '11  
the optimal improvement.  ...  In theory, it can obtain optimal cache performance. In this paper, we study a collaborative caching system that allows a program to choose different caching methods for its data.  ...  Dalessandro for help with LLVM. We also wish to thank Tongxin Bai, Bin Bao, Arrvindh Shriraman, Xiaoya Xiang, and the anonymous reviewers for their comments and/or proofreading.  ... 
doi:10.1145/1993478.1993485 dblp:conf/iwmm/GuD11 fatcat:esn5ago6gzctzmkmrdzw5cgtre
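
To make the abstract's idea concrete, here is a small sketch of collaborative LRU-MRU caching under one plausible reading: the program tags each access either as a normal "LRU" access (the block goes to the most-protected end of the recency stack) or as an "MRU" access (the block is placed where it will be evicted next). The class and hint names are illustrative, not the paper's interface.

```python
from collections import OrderedDict

class CollaborativeCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.stack = OrderedDict()   # front = next victim, back = most protected

    def access(self, addr, hint="LRU"):
        hit = addr in self.stack
        if hit:
            del self.stack[addr]
        elif len(self.stack) >= self.capacity:
            self.stack.popitem(last=False)        # evict from the front
        self.stack[addr] = True                   # insert at the protected end
        if hint == "MRU":                         # hint: make this block the next victim
            self.stack.move_to_end(addr, last=False)
        return hit

if __name__ == "__main__":
    c = CollaborativeCache(2)
    c.access("a")               # miss
    c.access("b", hint="MRU")   # miss, b marked as next victim
    c.access("c")               # miss, evicts b rather than a
    print(c.access("a"))        # True: a survived thanks to the hint on b
```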

On the theory and potential of LRU-MRU collaborative cache management

Xiaoming Gu, Chen Ding
2011 SIGPLAN notices  
the optimal improvement.  ...  In theory, it can obtain optimal cache performance. In this paper, we study a collaborative caching system that allows a program to choose different caching methods for its data.  ...  Dalessandro for help with LLVM. We also wish to thank Tongxin Bai, Bin Bao, Arrvindh Shriraman, Xiaoya Xiang, and the anonymous reviewers for their comments and/or proofreading.  ... 
doi:10.1145/2076022.1993485 fatcat:vr3pzqvrnfhcbaphds7vcgtzzu

Mitigating Prefetcher-Caused Pollution Using Informed Caching Policies for Prefetched Blocks

Vivek Seshadri, Samihan Yedkar, Hongyi Xin, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry
2015 ACM Transactions on Architecture and Code Optimization (TACO)  
Only predicted-accurate prefetches are inserted into the cache with a high priority.  ...  Many modern high-performance processors prefetch blocks into the on-chip cache. Prefetched blocks can potentially pollute the cache by evicting more useful blocks.  ...  For example, in a system with an inclusive cache hierarchy, evicting a block from the LLC requires the block to be evicted from all previous levels of the cache.  ... 
doi:10.1145/2677956 fatcat:si4li6c7zzhkfoquoohx25dpri
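
The abstract states that only predicted-accurate prefetches are inserted with high priority. Below is a minimal sketch of that decision, assuming a simple accuracy counter and an arbitrary 50% threshold; both are illustrative choices, not the paper's mechanism.

```python
class PrefetchInsertionPolicy:
    """Track prefetcher accuracy and pick an insertion priority for each fill."""

    def __init__(self, threshold=0.5):
        self.useful = 0        # prefetched blocks that were later demanded
        self.issued = 0        # all prefetched blocks brought into the cache
        self.threshold = threshold

    def record_fill(self, was_later_used):
        self.issued += 1
        self.useful += int(was_later_used)

    def insertion_priority(self, is_prefetch):
        if not is_prefetch:
            return "high"                          # demand fills keep normal priority
        accuracy = self.useful / self.issued if self.issued else 0.0
        return "high" if accuracy >= self.threshold else "low"

if __name__ == "__main__":
    p = PrefetchInsertionPolicy()
    for used in (True, False, False, False):
        p.record_fill(used)
    print(p.insertion_priority(is_prefetch=True))   # "low": accuracy 0.25 < 0.5
```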

A generalized theory of collaborative caching

Xiaoming Gu, Chen Ding
2013 SIGPLAN notices  
The complete effect includes four cases at a cache hit and two  ...  We show two theoretical results for the general hint. The first is a new cache replacement policy, priority LRU, which permits the complete range of choices between MRU and LRU.  ...  Acknowledgments The idea of priority hints was suggested by Kathryn McKinley, which was the starting point of this entire work.  ... 
doi:10.1145/2426642.2259012 fatcat:fv2aq7rtczdwdnjacqhcz2vjkq
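
A short sketch of the priority-LRU idea described above: a per-access hint chooses an insertion depth anywhere between the MRU end and the LRU end of the recency stack, so classic LRU insertion and MRU-style insertion become the two extremes of a single knob. The function below is hypothetical and only meant to make that range of choices concrete.

```python
def insert_with_priority(stack, addr, depth, capacity):
    """stack is ordered MRU -> LRU.  depth 0 = most protected (classic LRU
    insertion); depth == capacity - 1 = next victim (MRU-style insertion)."""
    if addr in stack:
        stack.remove(addr)
    elif len(stack) >= capacity:
        stack.pop()                         # evict the LRU end
    stack.insert(min(depth, len(stack)), addr)
    return stack

if __name__ == "__main__":
    s = []
    for addr, depth in [("x", 0), ("y", 0), ("z", 3)]:
        insert_with_priority(s, addr, depth, capacity=4)
    print(s)   # ['y', 'x', 'z']: z was inserted close to the eviction end
```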

A generalized theory of collaborative caching

Xiaoming Gu, Chen Ding
2012 Proceedings of the 2012 international symposium on Memory Management - ISMM '12  
The complete effect includes four cases at a cache hit and two  ...  We show two theoretical results for the general hint. The first is a new cache replacement policy, priority LRU, which permits the complete range of choices between MRU and LRU.  ...  Acknowledgments The idea of priority hints was suggested by Kathryn McKinley, which was the starting point of this entire work.  ... 
doi:10.1145/2258996.2259012 dblp:conf/iwmm/GuD12 fatcat:nywf2vbsjjf6vm74vv5tbeye4q

Relative Performance of a Multi-level Cache with Last-Level Cache Replacement: An Analytic Review [article]

Bijay Paikaray
2013 arXiv   pre-print
Current-day processors employ a multi-level cache hierarchy with one or two levels of private caches and a shared last-level cache (LLC).  ...  Cache replacement techniques for inclusive LLCs may not be efficient for a multi-level cache, as it can be shared by numerous applications with varying access behavior running simultaneously.  ...  Thus, much research has been done to improve cache performance, although usually focusing on a given level of the cache hierarchy (i.e., L1, L2 or L3).  ...
arXiv:1307.6406v1 fatcat:xswji4f5pzbnfdfg24gygqzoqu

The evicted-address filter

Vivek Seshadri, Onur Mutlu, Michael A. Kozuch, Todd C. Mowry
2012 Proceedings of the 21st international conference on Parallel architectures and compilation techniques - PACT '12  
with high reuse evicting each other from the cache.  ...  In a system with limited cache space, we would ideally like to prevent 1) cache pollution, i.e., blocks with low reuse evicting blocks with high reuse from the cache, and 2) cache thrashing, i.e., blocks  ...  (e.g., [17, 19, 34, 35]), two problems degrade cache performance significantly. First, cache blocks with little or no reuse can evict blocks with high reuse from the cache.  ...
doi:10.1145/2370816.2370868 dblp:conf/IEEEpact/SeshadriMKM12 fatcat:byttqcggj5hotmdg5vyujcpjli
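
A minimal sketch of the evicted-address-filter idea the abstract describes: remember recently evicted block addresses, and when a missing block's address is found in that filter, treat it as having proven reuse and insert it with high priority; otherwise insert it with low priority so one-use blocks cannot pollute the cache. The paper realizes the filter compactly in hardware (e.g., as a Bloom filter); the bounded deque here is only a software stand-in.

```python
from collections import deque

class EvictedAddressFilter:
    def __init__(self, max_entries):
        self.entries = deque(maxlen=max_entries)   # oldest addresses fall out

    def record_eviction(self, addr):
        self.entries.append(addr)

    def insertion_priority(self, missed_addr):
        if missed_addr in self.entries:
            return "high"      # block was reused soon after eviction
        return "low"           # unproven block: keep it easy to evict

if __name__ == "__main__":
    eaf = EvictedAddressFilter(max_entries=4)
    eaf.record_eviction(0x40)
    print(eaf.insertion_priority(0x40))   # "high"
    print(eaf.insertion_priority(0x80))   # "low"
```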

Dynamic and discrete cache insertion policies for managing shared last level caches in large multicores

Aswinkumar Sridharan, André Seznec
2017 Journal of Parallel and Distributed Computing  
Least Priority: Applications with Footprint-number in the range >= 16 are assigned the least priority. Only one out of thirty-two accesses is installed at the last-level cache with least priority.  ...  These mechanisms achieve fine-grained (cache-block-level) control by adjusting the eviction priorities.  ...
doi:10.1016/j.jpdc.2017.02.004 fatcat:24hmq6ycqnhhxcmyt2hzaxrmpu
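
The quoted rule is concrete enough to sketch: applications whose footprint-number falls in the highest range (>= 16) receive least priority, and only one out of every thirty-two of their fills is actually installed in the shared LLC. How the footprint-number is measured, and how the other priority classes behave, are not reproduced here; the "normal" branch below is an assumed default.

```python
import random

def insertion_decision(footprint_number, rng=random):
    """Return (install?, priority) for one LLC fill of this application."""
    if footprint_number >= 16:                    # least-priority class
        install = rng.randrange(32) == 0          # install ~1 out of 32 fills
        return install, "least"
    return True, "normal"                         # assumed default for other classes

if __name__ == "__main__":
    random.seed(0)
    decisions = [insertion_decision(20) for _ in range(32 * 4)]
    installed = sum(1 for ok, _ in decisions if ok)
    print(installed, "of", len(decisions), "fills installed")   # roughly 4
```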

P-OPT: Program-Directed Optimal Cache Management [chapter]

Xiaoming Gu, Tongxin Bai, Yaoqing Gao, Chengliang Zhang, Roch Archambault, Chen Ding
2008 Lecture Notes in Computer Science  
The paper addresses the communication problem with two new optimal algorithms for Program-directed OPTimal cache management (P-OPT), in which a program designates certain accesses as bypasses and trespasses  ...  As the amount of on-chip cache increases as a result of Moore's law, cache utilization is increasingly important as the number of processor cores multiplies and the contention for memory bandwidth becomes  ...  The result is portable, in the sense that the performance does not degrade if an implementation optimized for one cache size is used on a machine with a larger cache.  ...
doi:10.1007/978-3-540-89740-8_15 fatcat:zxnesy3xzvczdmegyknx35mbbm
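
The abstract says a program can designate accesses as bypasses and trespasses. Under one plausible reading (an assumption, not the paper's definition), a bypass serves the access without allocating the block, while a trespass allocates the block in the position that will be evicted next. The toy LRU simulator below only illustrates how such hints could steer replacement.

```python
def simulate(accesses, capacity):
    """accesses: list of (address, hint) with hint in {'normal', 'bypass', 'trespass'}."""
    stack, hits = [], 0          # stack[0] = next victim, stack[-1] = most protected
    for addr, hint in accesses:
        if addr in stack:
            hits += 1
            stack.remove(addr)
            stack.append(addr)
            continue
        if hint == "bypass":
            continue                          # serve the access without caching it
        if len(stack) >= capacity:
            stack.pop(0)                      # evict the next-victim position
        if hint == "trespass":
            stack.insert(0, addr)             # cached, but first in line to go
        else:
            stack.append(addr)                # normal LRU insertion
    return hits

if __name__ == "__main__":
    trace = [("a", "normal"), ("b", "trespass"), ("c", "normal"), ("a", "normal")]
    print(simulate(trace, capacity=2))   # 1 hit: 'a' survives because 'b' trespassed
```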

Scavenger: A New Last Level Cache Architecture with Global Block Priority

Arkaprava Basu, Nevin Kirman, Meyrem Kirman, Mainak Chaudhuri, Jose Martinez
2007 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007)  
When compared against a baseline configuration with a 1MB 8-way L2 cache, a Scavenger configuration with a 512kB 8-way conventional cache and a 512kB victim file achieves an IPC improvement of up to 63%  ...  However, we observe that the number of intervening misses at the last-level cache between the eviction of a particular block and its reuse can be very large, preventing traditional victim caching mechanisms  ...  We thank Andreas Moshovos and the anonymous reviewers for suggestions to improve the paper.  ... 
doi:10.1109/micro.2007.42 dblp:conf/micro/BasuKKCM07 fatcat:4wuxz6mqcjh4hhywwemfy2x26i
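
A policy-level sketch of the global block priority mentioned above: count how often each block address has missed in the LLC, and admit an evicted block into the fixed-size victim file only if its miss count is at least as high as that of the lowest-priority resident. Scavenger implements this with compact hardware structures; the heap-based code below is just the admission logic under that assumption.

```python
import heapq
from collections import defaultdict

class VictimFile:
    def __init__(self, capacity):
        self.capacity = capacity
        self.miss_count = defaultdict(int)
        self.heap = []                       # (priority, addr): min-heap of residents
        self.resident = set()

    def record_llc_miss(self, addr):
        self.miss_count[addr] += 1

    def on_eviction(self, addr):
        """Decide whether an LLC-evicted block deserves a victim-file slot."""
        prio = self.miss_count[addr]
        if len(self.resident) < self.capacity:
            heapq.heappush(self.heap, (prio, addr))
            self.resident.add(addr)
        elif self.heap and prio >= self.heap[0][0]:
            _, victim = heapq.heappop(self.heap)   # displace the lowest-priority block
            self.resident.discard(victim)
            heapq.heappush(self.heap, (prio, addr))
            self.resident.add(addr)

if __name__ == "__main__":
    vf = VictimFile(capacity=1)
    for a in ("x", "y", "y"):
        vf.record_llc_miss(a)
    vf.on_eviction("x")
    vf.on_eviction("y")          # y has missed more often, so it displaces x
    print(sorted(vf.resident))   # ['y']
```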

Scavenger: A New Last Level Cache Architecture with Global Block Priority

Arkaprava Basu, Nevin Kirman, Meyrem Kirman, Mainak Chaudhuri, Jose Martinez
2007 Microarchitecture (MICRO), Proceedings of the Annual International Symposium on  
When compared against a baseline configuration with a 1MB 8-way L2 cache, a Scavenger configuration with a 512kB 8-way conventional cache and a 512kB victim file achieves an IPC improvement of up to 63%  ...  However, we observe that the number of intervening misses at the last-level cache between the eviction of a particular block and its reuse can be very large, preventing traditional victim caching mechanisms  ...  We thank Andreas Moshovos and the anonymous reviewers for suggestions to improve the paper.  ... 
doi:10.1109/micro.2007.4408273 fatcat:cgcf5aeekff7jcg5drqwpawvvi

A Survey of Cache Bypassing Techniques

Sparsh Mittal
2016 Journal of Low Power Electronics and Applications  
With increasing core count, the cache demand of modern processors has also increased.  ...  This paper presents a survey of cache bypassing techniques for CPUs, GPUs and CPU-GPU heterogeneous systems, and for caches designed with SRAM, non-volatile memory (NVM) and die-stacked DRAM.  ...  (e.g., 32) priority levels to memory requests, such that those with a higher number of effective addresses get higher priority.  ...
doi:10.3390/jlpea6020005 fatcat:rkiqtcjbcvggde5utaqogg5xxa

ARW: Efficient Replacement Policies for Phase Change Memory and NAND Flash

Xi ZHANG, Xinning DUAN, Jincui YANG, Jingyuan WANG
2017 IEICE transactions on information and systems  
When used on both the LLC and a DRAM cache, ARW policies achieve an impressive reduction of 40% in write traffic without system performance degradation.  ...  ARW can reduce the write traffic to NVM by preventing dirty data blocks from frequent evictions. We evaluate ARW policies on systems with PCM as main memory and NAND Flash as disk.  ...  The reason is that with a DRAM buffer, the requests are filtered by two cache levels, which leads to an increase in the proportion of single-use lines in the DRAM cache.  ...
doi:10.1587/transinf.2016edp7205 fatcat:md4o6xmyoffshlvsshsklojkxu
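
The abstract's core intuition, keeping dirty blocks in the cache longer so that fewer writes reach the NVM below, can be sketched as a victim-selection rule that prefers clean blocks over dirty ones and only then falls back to recency. The actual ARW policies are richer; the two-part key below is just an approximation for illustration.

```python
class Line:
    def __init__(self, tag, dirty, lru_age):
        self.tag, self.dirty, self.lru_age = tag, dirty, lru_age   # larger age = older

def choose_victim(cache_set):
    # Key: clean lines before dirty ones, then the oldest among them.
    return min(cache_set, key=lambda l: (l.dirty, -l.lru_age))

if __name__ == "__main__":
    ways = [Line("A", dirty=True,  lru_age=9),
            Line("B", dirty=False, lru_age=3),
            Line("C", dirty=False, lru_age=5)]
    print(choose_victim(ways).tag)   # "C": oldest clean line, sparing the dirty "A"
```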

A fault-tolerant last level cache for CMPs operating at ultra-low voltage

Alexandra Ferrerón, Jesús Alastruey-Benedé, Darío Suárez Gracia, Teresa Monreal Arnal, Pablo Ibáñez Marín, Víctor Viñals Yúfera
2019 Journal of Parallel and Distributed Computing  
Overview of the system: Our baseline system consists of a tiled CMP, with an inclusive two-level cache hierarchy, where the second-level cache or LLC is shared and distributed among the processor  ...  words, a performance improvement with respect to BD of 2%, 2.2%, 2.7%, 3.6%, and 13.1%.  ...
doi:10.1016/j.jpdc.2018.10.010 fatcat:36qtjapm3faw3k7dca242clk3q