2,266 Hits in 6.4 sec

A flexible data to L2 cache mapping approach for future multicore processors

Lei Jin, Hyunjin Lee, Sangyeun Cho
2006 Proceedings of the 2006 workshop on Memory system performance and correctness - MSPC '06  
This paper proposes and studies a distributed L2 cache management approach through page-level data-to-cache-slice mapping in a future processor chip comprising many cores.  ...  Unlike previously studied "pure" hardware-based private and shared cache designs, the proposed OS-microarchitecture approach allows mimicking a wide spectrum of L2 caching policies without complex hardware  ...  OS design issues Data mapping through page allocation The OS manages a free list to keep track of available pages in physical memory.  ... 
doi:10.1145/1178597.1178613 dblp:conf/ACMmsp/JinLC06 fatcat:3puyalgmtvao7fzvbm2yzigqzm
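The page-allocation idea in this entry can be illustrated with a minimal sketch: the OS picks a free physical page whose frame number maps to a desired L2 slice. The slice count and the simple modulo mapping below are assumptions for illustration, not details taken from the paper.

```python
NUM_SLICES = 16  # assumed number of L2 cache slices (one per tile)

def slice_of_page(pfn: int) -> int:
    """Slice a physical page frame maps to, under a simple modulo mapping."""
    return pfn % NUM_SLICES

def alloc_page_for_slice(free_list: list, target_slice: int):
    """Pick the first free page frame that maps to the target slice,
    falling back to any free page if no matching frame exists."""
    for i, pfn in enumerate(free_list):
        if slice_of_page(pfn) == target_slice:
            return free_list.pop(i)
    return free_list.pop(0) if free_list else None

free = list(range(100, 140))
p = alloc_page_for_slice(free, 5)   # first frame with pfn % 16 == 5
```

By biasing which slice each page lands in, the same mechanism can mimic private caching (map a thread's pages to its local slice) or shared caching (spread pages across all slices), which is the spectrum the abstract describes.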

Enabling software management for multicore caches with a lightweight hardware support

Jiang Lin, Qingda Lu, Xiaoning Ding, Zhao Zhang, Xiaodong Zhang, P. Sadayappan
2009 Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis - SC '09  
The management of shared caches in multicore processors is a critical and challenging task. Many hardware and OS-based methods have been proposed.  ...  In order to turn cache partitioning methods into reality in the management of multicore processors, we propose to provide an affordable and lightweight hardware support to coordinate with OS-based cache  ...  Another study [6] investigates broad design issues in shared cache management through OS-level page allocation.  ... 
doi:10.1145/1654059.1654074 dblp:conf/sc/LinLDZZS09 fatcat:g5grixuigre55o2krwerkovrui

Achieving Predictable Performance with On-Chip Shared L2 Caches for Manycore-Based Real-Time Systems

Sangyeun Cho, Lei Jin, Kiyeon Lee
2007 13th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2007)  
This paper focuses on the problem of sharing on-chip caching capacity among multiple programs scheduled together, especially at the L2 cache level.  ...  We observe that both aspects depend on where, among the many cache slices, a cache block is mapped, and present an OS-based approach to managing the on-chip L2 cache memory by carefully mapping  ...  Section 3 describes our approach -managing distributed shared L2 caches via two-dimensional page coloring, followed by a quantitative evaluation in Section 4.  ... 
doi:10.1109/rtcsa.2007.16 dblp:conf/rtcsa/ChoJL07 fatcat:hymlichla5fn3czxfjztawbulq
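The two-dimensional page coloring mentioned in this entry can be sketched as follows: some physical frame-number bits select the cache-set color within a slice, and adjacent bits select the slice, so the OS can steer a page in both dimensions at once. The bit widths and layout here are illustrative assumptions.

```python
SET_COLOR_BITS = 2   # assumed: 4 page colors per slice
SLICE_BITS = 4       # assumed: 16 slices

def color_2d(pfn: int):
    """Decompose a page frame number into (slice, set-color):
    low bits pick the set color, the next bits pick the L2 slice."""
    set_color = pfn & ((1 << SET_COLOR_BITS) - 1)
    slice_id = (pfn >> SET_COLOR_BITS) & ((1 << SLICE_BITS) - 1)
    return slice_id, set_color

def page_matches(pfn: int, want_slice: int, want_color: int) -> bool:
    """True if this frame places the page at the desired slice and color."""
    return color_2d(pfn) == (want_slice, want_color)
```

Choosing frames by both coordinates lets the OS control where a page's blocks land across slices (latency, proximity) and within a slice (set conflicts), which is what makes the approach useful for predictable real-time behavior.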

SOS: A Software-Oriented Distributed Shared Cache Management Approach for Chip Multiprocessors

Lei Jin, Sangyeun Cho
2009 2009 18th International Conference on Parallel Architectures and Compilation Techniques  
SOS, our software-oriented distributed shared cache management approach, infers a program's data affinity hints through a novel machine learning based analysis of its L2 cache access behavior.  ...  This paper proposes a new software-oriented approach for managing the distributed shared L2 caches of a chip multiprocessor (CMP) for latency-oriented multithreaded applications.  ...  We showed that by applying the hints to guide page coloring and data replication on the shared L2 cache, it performs significantly better than both the shared cache and the private cache.  ... 
doi:10.1109/pact.2009.14 dblp:conf/IEEEpact/JinC09 fatcat:w6rcuo73vve3tmpj6jvyasckvm

FELI: HW/SW Support for On-Chip Distributed Shared Memory in Multicores [chapter]

Carlos Villavieja, Yoav Etsion, Alex Ramirez, Nacho Navarro
2011 Lecture Notes in Computer Science  
It relies on a set of TLB counters, and dynamic migration of pages from off-chip memory to on-chip memory.  ...  And a 10% improvement in average memory access time even accounting for the cost of page migrations and TLB invalidations.  ...  Memory management in runtime libraries and/or at the Operating System (OS) level is crucial in order to transparently take advantage of these software-managed on-chip memories.  ... 
doi:10.1007/978-3-642-23400-2_27 fatcat:tr4recb4m5g67etayk7bj7gnpa

ULCC

Xiaoning Ding, Kaibo Wang, Xiaodong Zhang
2011 SIGPLAN notices  
We have implemented ULCC at the user level based on a page-coloring technique for last level cache usage management.  ...  Second, at the user level, programmers are not able to allocate cache space at will to running threads in the shared cache, thus data sets with strong locality may not be allocated with sufficient cache  ...  Thus the buddy system managing free physical pages in the OS kernel puts the physical pages at the head of the free list, which is where the OS first tries to allocate physical pages from.  ... 
doi:10.1145/2038037.1941568 fatcat:ymnggpxrbvfqlgg6cjbuxca6qu
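The user-level coloring idea behind ULCC can be sketched roughly like this (this is not the actual ULCC implementation): allocate candidate pages, inspect their physical frame numbers, and retain only those whose color belongs to the set reserved for strong-locality data, releasing the rest back to the OS. The color count and helper names are hypothetical.

```python
PAGE_COLORS = 32  # assumed number of page colors in the last-level cache

def page_color(pfn: int) -> int:
    """Color of a physical page frame: the set-index bits shared
    between the page frame number and the LLC set index."""
    return pfn % PAGE_COLORS

def reserve_colored_pages(candidate_pfns, wanted_colors, count):
    """Keep up to `count` pages whose color is in `wanted_colors`;
    everything else would be released back to the OS."""
    kept, released = [], []
    for pfn in candidate_pfns:
        if len(kept) < count and page_color(pfn) in wanted_colors:
            kept.append(pfn)
        else:
            released.append(pfn)
    return kept, released
```

The buddy-system detail quoted in the abstract matters here: because freed pages go to the head of the free list, a user-level allocator that releases unwanted colors tends to get them right back, so it must hold candidates until enough pages of the wanted colors accumulate.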

ULCC

Xiaoning Ding, Kaibo Wang, Xiaodong Zhang
2011 Proceedings of the 16th ACM symposium on Principles and practice of parallel programming - PPoPP '11  
We have implemented ULCC at the user level based on a page-coloring technique for last level cache usage management.  ...  Second, at the user level, programmers are not able to allocate cache space at will to running threads in the shared cache, thus data sets with strong locality may not be allocated with sufficient cache  ...  Thus the buddy system managing free physical pages in the OS kernel puts the physical pages at the head of the free list, which is where the OS first tries to allocate physical pages from.  ... 
doi:10.1145/1941553.1941568 dblp:conf/ppopp/DingWZ11 fatcat:gy5rtxpilbf2vplswivww2b23e

Deterministic Memory Abstraction and Supporting Multicore System Architecture [article]

Farzad Farshchi, Prathap Kumar Valsan, Renato Mancuso, Heechul Yun
2018 arXiv   pre-print
We present deterministic memory-aware OS and architecture designs, including OS-level page allocator, hardware-level cache, and DRAM controller designs.  ...  We, therefore, propose a new holistic resource management approach driven by a new memory abstraction, which we call Deterministic Memory.  ...  OS-level shared resource management.  ... 
arXiv:1707.05260v4 fatcat:lt76fnjqhfgfbdppvaaqqh6poq

Deterministic Memory Abstraction and Supporting Multicore System Architecture

Farzad Farshchi, Prathap Kumar Valsan, Renato Mancuso, Heechul Yun, Marc Herbstritt
2018 Euromicro Conference on Real-Time Systems  
We present deterministic memory-aware OS and architecture designs, including OS-level page allocator, hardware-level cache, and DRAM controller designs.  ...  We, therefore, propose a new holistic resource management approach driven by a new memory abstraction, which we call Deterministic Memory.  ...  OS-level shared resource management.  ... 
doi:10.4230/lipics.ecrts.2018.1 dblp:conf/ecrts/FarshchiV0Y18 fatcat:xekrdpguhnh3rno5323hto2m3y

Dynamic hardware-assisted software-controlled page placement to manage capacity allocation and sharing within large caches

Manu Awasthi, Kshitij Sudan, Rajeev Balasubramonian, John Carter
2009 2009 IEEE 15th International Symposium on High Performance Computer Architecture  
These mechanisms allow the hardware and OS to dynamically manage cache capacity per thread as well as optimize placement of data shared by multiple threads.  ...  We show an average IPC improvement of 10-20% for multiprogrammed workloads with capacity allocation policies and an average IPC improvement of 8% for multi-threaded workloads with policies for shared page  ...  Migration for Shared Pages The previous sub-section describes a periodic OS routine that allocates cache capacity among cores.  ... 
doi:10.1109/hpca.2009.4798260 dblp:conf/hpca/AwasthiSBC09 fatcat:bxzvguck3jgijf5zvnvzh5qdia
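The "periodic OS routine that allocates cache capacity among cores" quoted in this entry can be sketched as a simple proportional split: each epoch, page colors are regranted in proportion to per-core miss counts, so miss-heavy threads receive more capacity. The policy and parameter names below are illustrative, not taken from the paper.

```python
def split_colors(num_colors: int, misses_per_core: list) -> list:
    """Grant page colors to cores in proportion to their miss counts,
    with at least one color each, summing exactly to num_colors."""
    total = sum(misses_per_core)
    grants = [max(1, round(num_colors * m / total)) for m in misses_per_core]
    # Trim or pad so the grants sum exactly to the available colors.
    while sum(grants) > num_colors:
        grants[grants.index(max(grants))] -= 1
    while sum(grants) < num_colors:
        grants[grants.index(min(grants))] += 1
    return grants
```

Migrating shared pages, the other mechanism the abstract mentions, would then re-place pages whose grants changed; the hardware assist in the paper exists precisely to make those migrations cheap.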

Page Classifier and Placer: A Scheme of Managing Hybrid Caches [chapter]

Xin Yu, Xuanhua Shi, Hai Jin, Xiaofei Liao, Song Wu, Xiaoming Li
2014 Lecture Notes in Computer Science  
Our approach employs a lightweight hardware profiler to monitor cache behaviors at OS page level and to capture the hot pages.  ...  We propose a new HCA approach that enables OS to be aware of underlying hybrid cache architecture and to control data placement, at OS page level, onto different cache regions.  ...  In particular, our technique for the first time dynamically manages hybrid cache at page level through page migration and optimizes the migration policy to amortize the performance overhead.  ... 
doi:10.1007/978-3-662-44917-2_2 fatcat:2kfksbpiszejrjefjbtdss2u4e

Distance-aware round-robin mapping for large NUCA caches

Alberto Ros, Marcelo Cintra, Manuel E. Acacio, Jose M. Garcia
2009 2009 International Conference on High Performance Computing (HiPC)  
However, our policy also introduces an upper bound on the deviation of the distribution of memory pages among cache banks, which lessens the number of off-chip accesses.  ...  In this work, we propose the distance-aware round-robin mapping policy, an OS-managed policy which addresses the trade-off between cache access latency and number of off-chip accesses.  ...  Although cache banks are physically distributed, they constitute a logically shared cache (the L2 cache level in this work).  ... 
doi:10.1109/hipc.2009.5433220 dblp:conf/hipc/RosCAG09 fatcat:scsn6jbwcvesddpspej3bphrda
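The distance-aware round-robin policy with a bounded deviation, as described in this entry, can be sketched as: try banks from nearest to farthest, but skip a bank if assigning to it would exceed the allowed imbalance against the least-loaded bank. The function and parameter names are illustrative assumptions.

```python
def assign_bank(counts: dict, preferred_order: list, max_deviation: int) -> int:
    """Assign a new page to a bank: prefer nearby banks, but cap how far
    any bank's page count may drift above the least-loaded bank."""
    floor = min(counts.values())
    for bank in preferred_order:
        if counts[bank] - floor < max_deviation:
            counts[bank] += 1
            return bank
    # Every bank is at the bound: take the nearest one anyway.
    bank = preferred_order[0]
    counts[bank] += 1
    return bank

counts = {0: 0, 1: 0}
b1 = assign_bank(counts, [0, 1], 2)  # nearest bank still under the bound
b2 = assign_bank(counts, [0, 1], 2)
b3 = assign_bank(counts, [0, 1], 2)  # bank 0 at the bound, spill to bank 1
```

The bound is what trades latency against off-chip misses: a small `max_deviation` spreads pages evenly (fewer evictions, more remote accesses), while a large one keeps pages near their accessing core.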

Rhymes: A shared virtual memory system for non-coherent tiled many-core architectures

King Tin Lam, Jinghao Shi, Dominic Hung, Cho-Li Wang, Zhiquan Lai, Wangbin Zhu, Youliang Yan
2014 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)  
Rhymes features a two-way cache coherence protocol to enforce release consistency for pages allocated in shared physical memory (SPM) and scope consistency for pages in per-core private memory.  ...  Experimental results show that our SVM outperforms the pure SPM approach used by Intel's software managed coherence (SMC) library by up to 12 times through improved cache utilization for applications with  ...  MPBT) of the page table entry is set to bypass L2 cache, i.e. the shared data are cacheable in L1 only. Recall that SCC L1 cache can be made write-back or write-through.  ... 
doi:10.1109/padsw.2014.7097807 dblp:conf/icpads/LamSHWLZY14 fatcat:sadkzvqywjepzenwnp32nr65fi

Avoiding cache thrashing due to private data placement in last-level cache for manycore scaling

Jiayuan Meng, Kevin Skadron
2009 2009 IEEE International Conference on Computer Design  
In such cases, a shared, inclusive last level cache (LLC) can improve data sharing and avoid three-way communication for shared reads.  ...  However, when capacity becomes a limitation for the directory or last-level cache, this is not sufficient.  ...  NON-UNIFORM DISTRIBUTION OF PRIVATE DATA We study a two-level memory hierarchy with private L1 caches and an LLC as shared, distributed L2 caches. Hsu et al.  ... 
doi:10.1109/iccd.2009.5413143 dblp:conf/iccd/MengS09 fatcat:fsnkpsdyonevjgjpalpeij6tsa

Virtual hierarchies to support server consolidation

Michael R. Marty, Mark D. Hill
2007 SIGARCH Computer Architecture News  
page sharing among VMs.  ...  We begin with a tiled architecture where each of 64 tiles contains a processor, private L1 caches, and an L2 bank.  ...  An alternative approach minimizes hardware change by relying on the OS or hypervisor to manage the cache hierarchy through page allocation.  ... 
doi:10.1145/1273440.1250670 fatcat:unky5v7mjrgjrkqo6onif4olqm