A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit the original URL.
Recent years have seen a large volume of proposals on managing the shared last-level cache (LLC) of chip-multiprocessors (CMPs). ... While very few of these studies evaluate the proposed policies on shared memory multi-threaded applications, they do not improve constructive cross-thread sharing of data in the LLC. ... The shared LLC behavior of the emerging recognition, mining, synthesis (RMS), and bioinformatics workloads has been evaluated in detail on CMP systems in prior studies. ... doi:10.1109/iiswc.2013.6704665 dblp:conf/iiswc/NatarajanC13 fatcat:3jdrs3tfufczrharcm2ubnljgi
Proceedings of the 2010 Workshop on Interaction between Compilers and Computer Architecture - INTERACT-14
complex and irregular data structures, such as hash tables; (iii) although data mining applications have a large amount of thread-level parallelism, efficient extraction of such parallelism depends on on-chip ... The exponential growth of on-chip resources makes it critical to exploit parallelism at all granularities for improving the performance of data mining applications. ... To compare the parallel performance of multicore processors with shared and private last-level on-chip cache, we also evaluate the parallel execution of these applications on an 8-core x86-based CMP with ... doi:10.1145/1739025.1739040 fatcat:qs7yslt5yjetvp5rsiybexjzve
We build the Periodic Table of Memory Access Patterns (PT-MAP), where the indifference curves are analogous to the energy levels in physics, and memory performance optimization is essentially an energy ... In this study, we introduce what we refer to as Gene-Patterns, which are the base patterns of diverse applications. ... CMP has a deep (three-level) cache hierarchy, where each core is equipped with a split L1 instruction and data cache, and a unified L2 cache, and LLC is shared among all the on-chip cores. ... arXiv:1909.09765v2 fatcat:ksyywbmh2jc6ze74dottwk767u
Looking back, I find that doing a PhD is one of the most challenging yet rewarding experiences of my life. ... This overhead had been limited to 5-15% of system runtime by using a set of sophisticated hardware solutions, but has increased to 20-50% for many scenarios, including running workloads with large memory ... Since the page table walk accesses the last-level cache (LLC) and brings data in 64-byte cache line sizes, seven additional translations are fetched. ... fatcat:r7csmogz6jbvjnkawsobdvb4di
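The "seven additional translations" figure in the last snippet can be checked with simple arithmetic. The sketch below assumes 8-byte page table entries (as on x86-64); the snippet itself states only the 64-byte cache line size, so the entry size and the function name are illustrative assumptions.

```python
CACHE_LINE_BYTES = 64  # cache line size stated in the snippet
PTE_BYTES = 8          # assumption: 8-byte page table entries, as on x86-64


def extra_translations_per_line(line_bytes: int = CACHE_LINE_BYTES,
                                pte_bytes: int = PTE_BYTES) -> int:
    """Return how many neighbouring translations arrive with the requested one.

    A page table walk that fetches one entry pulls in the whole cache line,
    so every other entry that shares the line comes along with it.
    """
    entries_per_line = line_bytes // pte_bytes
    return entries_per_line - 1  # exclude the entry that was actually requested


if __name__ == "__main__":
    print(extra_translations_per_line())
```

With a different entry size the count changes accordingly, which is why the 8-byte assumption matters for matching the number quoted above.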