
Characterizing multi-threaded applications for designing sharing-aware last-level cache replacement policies

Ragavendra Natarajan, Mainak Chaudhuri
2013 2013 IEEE International Symposium on Workload Characterization (IISWC)  
Recent years have seen a large volume of proposals on managing the shared last-level cache (LLC) of chip-multiprocessors (CMPs).  ...  While very few of these studies evaluate the proposed policies on shared memory multi-threaded applications, they do not improve constructive cross-thread sharing of data in the LLC.  ...  The shared LLC behavior of the emerging recognition, mining, synthesis (RMS), and bioinformatics workloads has been evaluated in detail on CMP systems in prior studies [15], [23], [24], [29].  ... 
doi:10.1109/iiswc.2013.6704665 dblp:conf/iiswc/NatarajanC13 fatcat:3jdrs3tfufczrharcm2ubnljgi

Performance characterization of data mining benchmarks

Vineeth Mekkat, Ragavendra Natarajan, Wei-Chung Hsu, Antonia Zhai
2010 Proceedings of the 2010 Workshop on Interaction between Compilers and Computer Architecture - INTERACT-14  
complex and irregular data structures, such as hash tables; (iii) although data mining applications have a large amount of thread-level parallelism, efficient extraction of such parallelism depends on on-chip  ...  The exponential growth of on-chip resources makes it critical to exploit parallelism at all granularities for improving the performance of data mining applications.  ...  To compare the parallel performance of multicore processors with shared and private last-level on-chip cache, we also evaluate the parallel execution of these applications on an 8-core x86-based CMP with  ... 
doi:10.1145/1739025.1739040 fatcat:qs7yslt5yjetvp5rsiybexjzve

Gene-Patterns: Should Architecture be Customized for Each Application? [article]

Yuhang Liu, Luming Wang, Xiang Li, Yang Wang, Mingyu Chen, Yungang Bao
2019 arXiv pre-print
We build the Periodic Table of Memory Access Patterns (PT-MAP), where the indifference curves are analogous to the energy levels in physics, and memory performance optimization is essentially an energy  ...  In this study, we introduce what we refer to as Gene-Patterns, which are the base patterns of diverse applications.  ...  CMP has a deep (three-level) cache hierarchy, where each core is equipped with a split L1 instruction and data cache, and a unified L2 cache, and LLC is shared among all the on-chip cores.  ... 
arXiv:1909.09765v2 fatcat:ksyywbmh2jc6ze74dottwk767u

Architectural Support for Efficient Virtual Memory on Big-Memory Systems

Binh Pham
unpublished
Looking back, I find that doing a PhD is one of the most challenging yet rewarding experiences of my life.  ...  This overhead had been limited to 5-15% of system runtime by using a set of sophisticated hardware solutions, but has increased to 20-50% for many scenarios, including running workloads with large memory  ...  Since the page table walk accesses the last-level cache (LLC) and brings data in 64-byte cache line sizes, seven additional translations are fetched.  ... 
fatcat:r7csmogz6jbvjnkawsobdvb4di