A new memory monitoring scheme for memory-aware scheduling and partitioning

G.E. Suh, S. Devadas, L. Rudolph
Proceedings Eighth International Symposium on High Performance Computer Architecture  
This overall miss-rate can be used to improve scheduling and partitioning schemes.  ...  We propose a low overhead, on-line memory monitoring scheme utilizing a set of novel hardware counters.  ...  Chen, and especially to P. Portante.  ... 
doi:10.1109/hpca.2002.995703 dblp:conf/hpca/SuhDR02 fatcat:mjpy5jpsejfkbchaz6gakyifqy

MN-MATE

Kyu Ho Park, Woomin Hwang, Hyunchul Seok, Chulmin Kim, Dong-jae Shin, Dong Jin Kim, Min Kyu Maeng, Seong Min Kim
2015 ACM Journal on Emerging Technologies in Computing Systems  
Based on the monitored information about the allocated memory, a guest OS co-schedules tasks accessing different types of memory with complementary access intensity.  ...  MN-MATE: Elastic resource management of manycores and a hybrid memory hierarchy for a cloud node. ACM  ...  One is a power-aware memory management scheme in the guest OS for user-level memories.  ... 
doi:10.1145/2701429 fatcat:t7amqrizzbhszookhhodwfghe4

On Cache-Aware Task Partitioning for Multicore Embedded Real-Time Systems

Aaron Lindsay, Binoy Ravindran
2014 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS)  
We present a partitioning scheme called LWFG, which minimizes cache misses by partitioning tasks that share memory onto the same core and by evenly distributing the total working set size across cores.  ...  One approach for real-time scheduling on multicore platforms involves task partitioning, which statically assigns tasks to cores, enabling subsequent core-local scheduling.  ...  Conclusions We have presented the rationale behind, and the design of, a new cache-aware, real-time task partitioning scheme, called LWFG.  ... 
doi:10.1109/hpcc.2014.105 dblp:conf/hpcc/LindsayR14 fatcat:74f64soosfg7fjn7sjqmk4uppa

MARACAS: A Real-Time Multicore VCPU Scheduling Framework

Ying Ye, Richard West, Jingyi Zhang, Zhuoqun Cheng
2016 2016 IEEE Real-Time Systems Symposium (RTSS)  
This paper describes a multicore scheduling and load-balancing framework called MARACAS, to address shared cache and memory bus contention.  ...  MARACAS also supports cache-aware scheduling and migration using page recoloring to improve performance isolation amongst VCPUs.  ...  This is followed by several sections that describe the memory- and cache-aware scheduling features that are new to MARACAS, including the algorithms for VCPU load balancing and background-mode resource management  ... 
doi:10.1109/rtss.2016.026 dblp:conf/rtss/YeWZC16 fatcat:nafe4dbbzzfrrdi5vdx5bfuw7a

A Coordinated Approach for Practical OS-Level Cache Management in Multi-core Real-Time Systems

Hyoseung Kim, Arvind Kandhalu, Ragunathan Rajkumar
2013 2013 25th Euromicro Conference on Real-Time Systems  
Our scheme provides predictable cache performance, addresses the aforementioned problems of existing software cache partitioning, and efficiently allocates cache partitions to schedule a given taskset.  ...  These are major impediments to the practical adoption of software cache partitioning. In this paper, we propose a practical OS-level cache management scheme for multi-core real-time systems.  ...  Our scheme can be used not only for developing a new system but also for migrating existing applications from single-core to multi-core platforms.  ... 
doi:10.1109/ecrts.2013.19 dblp:conf/ecrts/KimKR13 fatcat:bntx6eu3gjejnleg3hc5ptgd5i

Dynamic Resource Management for Efficient Utilization of Multitasking GPUs

Jason Jong Kyu Park, Yongjun Park, Scott Mahlke
2017 SIGARCH Computer Architecture News  
However, multitasking performance varies heavily depending on the resource partitions within each scheme, and the application mixes.  ...  Furthermore, dynamism within a kernel and interference between the kernels are automatically considered because GPU Maestro finds the best performing partition through direct measurements.  ...  Acknowledgments We would like to thank the anonymous reviewers as well as the fellow members of CCCP research group for their valuable comments and feedbacks.  ... 
doi:10.1145/3093337.3037707 fatcat:7vikinfjtbbmnperrnxretampm

Dynamic Resource Management for Efficient Utilization of Multitasking GPUs

Jason Jong Kyu Park, Yongjun Park, Scott Mahlke
2017 Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '17  
However, multitasking performance varies heavily depending on the resource partitions within each scheme, and the application mixes.  ...  Furthermore, dynamism within a kernel and interference between the kernels are automatically considered because GPU Maestro finds the best performing partition through direct measurements.  ...  Acknowledgments We would like to thank the anonymous reviewers as well as the fellow members of CCCP research group for their valuable comments and feedbacks.  ... 
doi:10.1145/3037697.3037707 dblp:conf/asplos/ParkPM17 fatcat:flmnbk4x3je4loearqmn2uzwkm

Dynamic Resource Management for Efficient Utilization of Multitasking GPUs

Jason Jong Kyu Park, Yongjun Park, Scott Mahlke
2017 ACM SIGOPS Operating Systems Review  
However, multitasking performance varies heavily depending on the resource partitions within each scheme, and the application mixes.  ...  Furthermore, dynamism within a kernel and interference between the kernels are automatically considered because GPU Maestro finds the best performing partition through direct measurements.  ...  Acknowledgments We would like to thank the anonymous reviewers as well as the fellow members of CCCP research group for their valuable comments and feedbacks.  ... 
doi:10.1145/3093315.3037707 fatcat:5xjasiupnrcctp6oarqwrtbukm

Dynamic Resource Management for Efficient Utilization of Multitasking GPUs

Jason Jong Kyu Park, Yongjun Park, Scott Mahlke
2017 SIGPLAN notices  
However, multitasking performance varies heavily depending on the resource partitions within each scheme, and the application mixes.  ...  Furthermore, dynamism within a kernel and interference between the kernels are automatically considered because GPU Maestro finds the best performing partition through direct measurements.  ...  Acknowledgments We would like to thank the anonymous reviewers as well as the fellow members of CCCP research group for their valuable comments and feedbacks.  ... 
doi:10.1145/3093336.3037707 fatcat:xkqmepvakbh23gfyrnkjot6uge

Survey of Memory Management Techniques for HPC and Cloud Computing

Anna Pupykina, Giovanni Agosta
2019 IEEE Access  
However, for this scenario to succeed in practice, resources, including memory, need to be allocated with a vision that includes both the application requirements and the current and future state of the  ...  The emergence of new classes of HPC applications and usage models, such as real-time HPC and cloud HPC, coupled with the increasingly heterogeneous nature of HPC architectures, requires a renewed investigation  ...  The polymorphic memory management proposed in [72] performs a periodic procedure of hardware-assisted page monitoring, followed by OS intervention for migration and partitioning.  ... 
doi:10.1109/access.2019.2954169 fatcat:hwtpltrdrffqdjdofhr3shjkla

Contention Aware Scheduler with Accurate Memory Bandwidth Measurement for Predictable Multicore Software

Mogilicharla Surender, Boinapalli Venkanna
2017 International Journal of Advanced Research in Computer Science and Software Engineering  
We propose a contention aware scheduler for predictable multi core software to reduce contention and to improve the performance of the system.  ...  Keywords-Memory Bandwidth, Contention aware scheduler, performance counters I.  ...  Any contention aware scheduler must consist of two parts: a classification scheme for identifying which applications should and should not be scheduled together as well as the scheduling policy that assigns  ... 
doi:10.23956/ijarcsse/v7i6/0175 fatcat:iweddfl3czc6lo45as6swcxudm

A bandwidth-aware memory-subsystem resource management using non-invasive resource profilers for large CMP systems

Dimitris Kaseridis, Jeffrey Stuecheli, Jian Chen, Lizy K John
2010 HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture  
We propose a dynamic memory-subsystem resource management scheme that considers both cache capacity and memory bandwidth contention in large multi-chip CMP systems.  ...  While previously proposed schemes focus on resource sharing within a chip, we explore additional possibilities both inside and outside a single chip.  ...  ACKNOWLEDGEMENTS The authors would like to thank the anonymous reviewers for their suggestions that helped improve the quality of this paper.  ... 
doi:10.1109/hpca.2010.5416654 dblp:conf/hpca/KaseridisSCJ10 fatcat:pwl37tzy7bb3lgo3bvyfm7j35i

Cache Friendliness-Aware Management of Shared Last-Level Caches for High Performance Multi-Core Systems

Dimitris Kaseridis, Muhammad Faisal Iqbal, Lizy Kurian John
2014 IEEE transactions on computers  
In this work we describe a quasi-partitioning scheme for last-level caches that combines the memory-level parallelism, cache friendliness and interference sensitivity of competing applications, to efficiently  ...  The proposed scheme improves both system throughput and execution fairness, outperforming previous schemes that are oblivious to applications' memory behavior.  ...  In this work we describe a Memory-level parallelism and Cache-Friendliness aware Quasi-partitioning scheme (MCFQ).  ... 
doi:10.1109/tc.2013.18 fatcat:lyqd5vduzzcdln6k7mfzbbi3yi

NUMA obliviousness through memory mapping

Mrunal Gawade, Martin Kersten
2015 Proceedings of the 11th International Workshop on Data Management on New Hardware - DaMoN'15  
In a shared memory setting the multi-socket CPUs are equipped with their own memory module, and access memory modules across sockets in a non-uniform access pattern (NUMA).  ...  Hence, setting explicit process and memory affinity results into a robust execution in NUMA oblivious plans.  ...  In this scheme a single MonetDB instance uses horizontally partitioned data (lineitem and orders tables) across four sockets.  ... 
doi:10.1145/2771937.2771948 dblp:conf/damon/GawadeK15 fatcat:zkx62rikc5f7pfaz6fnsygvnii

A Staged Memory Resource Management Method for CMP systems

Yangguo Liu, Junlin Lu, Dong Tong, Xu Cheng
2017 2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP)  
We further integrate DPBT with the improved memory channel partitioning scheme and a memory scheduling policy to leverage the architecture advantages, and present a Stage Memory Resource Management Method  ...  DPBT achieves a more balanced memory bandwidth partitioning. Moreover, we improve the previous memory channel partitioning scheme by integrating it with a bank partitioning.  ...  (DPBT), a new application-aware bandwidth-throttling approach to better balance the memory bandwidth among co-scheduled applications.  ... 
doi:10.1109/asap.2017.7995264 dblp:conf/asap/LiuLTC17 fatcat:26o65pssdnd3lmdzalfc34eiwy