Locality-aware data replication in the Last-Level Cache

George Kurian, Srinivas Devadas, Omer Khan
2014 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)  
We propose a locality-aware selective data replication protocol for the last-level cache (LLC).  ...  Our approach relies on low overhead yet highly accurate in-hardware runtime classification of data locality at the cache line granularity, and only allows replication for cache lines with high reuse.  ...  Conclusion We have proposed an intelligent locality-aware data replication scheme for the last-level cache.  ... 
doi:10.1109/hpca.2014.6835921 dblp:conf/hpca/KurianDK14 fatcat:gsjpmhtpvzbpdjv6fdmvwfqt6m

A Two-Level Cache Aware Adaptive Data Replication Mechanism for Shared LLC

Qianqian Wu, Zhenzhou Ji
2022 IEICE transactions on information and systems  
In fact, in order to realize locality-aware data replication, LADR also adds a new structure (complete locality classifier) to store locality information  ...  [4] G. Kurian, S. Devadas, and O. Khan, “Locality-aware data replication in the last-level cache,” International Symposium on High-  ... 
doi:10.1587/transinf.2022edl8002 fatcat:dm5wa5qs6jbcvdhocoqz6t5k74

On Improving Efficiency and Utilization of Last Level Cache in Multicore Systems

Yumna Zahid, Hina Khurshid, Zulfiqar Ali Memon
2018 Information Technology and Control  
The current on-chip architecture comprises multiple cores which usually share last level cache which can be physically distributed on chip.  ...  This article aims to provide the researchers with the state-of-the-art critical review of the various approaches that focus on data replication and cache partitioning techniques for L3 cache.  ...  This means if a miss occurs in the L1 cache and as a result, the last level cache will be searched for requesting data; data are invalidated on the local cache of a core.  ... 
doi:10.5755/j01.itc.47.3.18433 fatcat:pgrmyliv3ra5vjlkqqv3vhuudu

The locality-aware adaptive cache coherence protocol

George Kurian, Omer Khan, Srinivas Devadas
2013 SIGARCH Computer Architecture News  
On a set of parallel benchmarks, our low-overhead locality-aware mechanisms reduce the overall energy by 25% and completion time by 15% in an NoC-based multicore with the Reactive-NUCA on-chip cache organization  ...  Therefore, harnessing data locality is of fundamental importance in future processors.  ...  A request for data allocates and replicates a data block in the private cache hierarchy even if the data has no spatial or temporal locality.  ... 
doi:10.1145/2508148.2485967 fatcat:6dmu2mv4hjgajeq5lqrgjlbhri

NoC-aware cache design for chip multiprocessors

Ahmed K. Abousamra, Rami G. Melhem, Alex K. Jones
2010 Proceedings of the 19th international conference on Parallel architectures and compilation techniques - PACT '10  
In this work we present a NoC-aware cache design that focuses on communication locality; a property both the cache and NoC affect and can exploit.  ...  The performance of chip multiprocessors (CMPs) is dependent on the data access latency, which is highly dependent on the design of the on-chip interconnect (NoC) and the organization of the memory caches  ...  However, we do not study replication in this work. Similarly, data replication is not used in the last level cache of the state-of-the-art NoC-Cache co-designed system [7] which we compare with.  ... 
doi:10.1145/1854273.1854354 dblp:conf/IEEEpact/AbousamraMJ10 fatcat:3gyhz6u2efajfoaw4dsdr6vd6y

The locality-aware adaptive cache coherence protocol

George Kurian, Omer Khan, Srinivas Devadas
2013 Proceedings of the 40th Annual International Symposium on Computer Architecture - ISCA '13  
On a set of parallel benchmarks, our low-overhead locality-aware mechanisms reduce the overall energy by 25% and completion time by 15% in an NoC-based multicore with the Reactive-NUCA on-chip cache organization  ...  Therefore, harnessing data locality is of fundamental importance in future processors.  ...  A request for data allocates and replicates a data block in the private cache hierarchy even if the data has no spatial or temporal locality.  ... 
doi:10.1145/2485922.2485967 dblp:conf/isca/KurianKD13 fatcat:vzu2y5zcyrgk5i3ovxvtworzly

A Reuse-Degree based Locality Classifier for Locality-Aware Data Replication

Qianqian Wu, Zhenzhou Ji
2019 IEEE Access  
INDEX TERMS Chip multiprocessors (CMPs), last level cache (LLC), data replication, locality classifier, reuse-degree (RD).  ...  The state-of-the-art Locality-Aware Data Replication (LADR) scheme provides an effective tradeoff between capacity and latency through an in-hardware structure named locality classifier.  ...  The Locality-Aware Data Replication (LADR) [7] policy controls data replication according to data locality.  ... 
doi:10.1109/access.2019.2959840 fatcat:i4youx5gozeijfox7d2w6xyraa

Contents Management in First-Level Multibanked Data Caches [chapter]

E. F. Torres, P. Ibañez, V. Viñals, J. M. Llabería
2004 Lecture Notes in Computer Science  
In this paper we introduce replication degree and data distribution as the main multibanking design axes.  ...  High-performance processors will increasingly rely on multibanked first-level caches to meet frequency requirements.  ...  The 8-entry coalescing local write buffers update the L1 cache banks in unused cycles.  ... 
doi:10.1007/978-3-540-27866-5_68 fatcat:odx76zr5ezeqle3oxb23t5pgke

OctopusFS in action

Elena Kakoulli, Nikolaos D. Karmiris, Herodotos Herodotou
2018 Proceedings of the VLDB Endowment  
It also exposes the network locations and storage tiers of the data in order to allow higher-level systems to make locality-aware and tier-aware decisions.  ...  OctopusFS contains automated data-driven policies for managing the placement and retrieval of data across the nodes and storage tiers of the cluster.  ...  New data-processing systems are utilizing memory or SSDs for primary storage [9] or for actively caching data [7], while others are using local disks for caching data from remote or cloud storage  ... 
doi:10.14778/3229863.3236223 fatcat:ioql5kdbujea7icgruj5v6dcoi

Low-Latency Mechanisms for Near-Threshold Operation of Private Caches in Shared Memory Multicores

Farrukh Hijaz, Qingchuan Shi, Omer Khan
2012 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture Workshops  
Although the last-level shared caches (LLC) in modern multicores are protected using error correcting codes (ECC), the private caches have been left unprotected due to their performance sensitivity to  ...  of each cache line that is classified as private data, and only allow such data to be cached in the private caches if it shows high spatiotemporal locality.  ...  A request allocates and replicates a data block in the private caches even if the data has no spatial or temporal locality.  ... 
doi:10.1109/microw.2012.10 dblp:conf/micro/HijazSK12 fatcat:if5fzcgryrfv7g4tvc53jeciqi

Analysis Study on Caching and Replica Placement Algorithm for Content Distribution in Distributed Computing Networks

Anna Saro Vijendran
2012 International Journal of Peer to Peer Networks  
And also suggest a new popularity based QoS-aware (Quality of Service) smart replica placement algorithm for content distribution in peer-to-peer overlay networks which overcomes the access latency, fault  ...  Recently there has been significant research focus on distributed computing network massively caching and replica placement problems for content distribution in globally.  ...  Data replication and caching techniques are the important two services in distributed computing networks.  ... 
doi:10.5121/ijp2p.2012.3602 fatcat:wawelnixenacvgnrxygp2wfnsm

A Dynamic Pressure-Aware Associative Placement Strategy for Large Scale Chip Multiprocessors

Mohammad Hammoud, Sangyeun Cho, Rami Melhem
2010 IEEE computer architecture letters  
Temporal pressure at the on-chip last-level cache is continuously collected at a group (comprised of cache sets) granularity, and periodically recorded at the memory controller to guide the placement  ...  This paper describes Cache Equalizer (CE), a novel distributed cache management scheme for large scale chip multiprocessors (CMPs). Our work is motivated by large asymmetry in cache sets usages.  ...  CE embarks upon such a key factor and suggests mapping cache blocks to the on-chip last level cache based on temporal pressures.  ... 
doi:10.1109/l-ca.2010.7 fatcat:5obf374lfnbnzhyuy2qr2r4r2i

A Multicore-Aware Runtime Architecture for Scalable Service Composition

Daniele Bonetta, Achille Peternier, Cesare Pautasso, Walter Binder
2010 2010 IEEE Asia-Pacific Services Computing Conference  
multicore-awareness in the design of scalable process execution engines.  ...  However, the advent of modern multicore machines, comprising several chip multiprocessors each offering multiple cores and often featuring a large shared cache, offers the opportunity to redesign the architecture  ...  ACKNOWLEDGMENT This work is funded by the Swiss National Science Foundation with the SOSOA project (SINERGIA grant nr. CRSI22 127386), and by the European Community under the grant agreement no.  ... 
doi:10.1109/apscc.2010.61 dblp:conf/apscc/BonettaPPB10 fatcat:b7cv3fatcvc5jg55hj2wtaas3a

C-Aware: A Cache Management Algorithm Considering Cache Media Access Characteristic in Cloud Computing

Zhu Xudong, Yin Yang, Liu Zhenjun, Shao Fang
2013 Mathematical Problems in Engineering  
Our benchmark results in real system show that, in the scenario where the size of local cache is half of data set, C-Aware gets nearly 80% improvement compared with traditional methods when the server  ...  Using local disk of computing nodes as a cache can sometimes get better performance than accessing data through the network.  ...  Acknowledgments This work is supported in part by Natural Science Foundation of China "Research on the snapshot data security storage technology for authorization of release," no. 61100057, and the National  ... 
doi:10.1155/2013/867167 fatcat:y4frimxngfhs7kdrcehv5adsxu

Multicore-aware reuse distance analysis

Derek L Schuff, Benjamin S Parsons, Vijay S Pai
2010 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)  
The results show that adding multicore-awareness substantially improves the ability of reuse distance analysis to model cache behavior, reducing the error in miss ratio prediction (relative to cache simulation  ...  Existing reuse distance analysis methods track the number of distinct addresses referenced between reuses of the same address by a given thread, but do not model the effects of data references by other  ...  Shi et al. present an analytical model of data replication and a method to simulate multiple caches in a CMP in a single pass [24].  ... 
doi:10.1109/ipdpsw.2010.5470780 dblp:conf/ipps/SchuffPP10 fatcat:hnzomp6i25hnjdxl7cysjglquy