
Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems

Onur Mutlu, Thomas Moscibroda
2008 2008 International Symposium on Computer Architecture  
Our parallelism-aware batch scheduler (PAR-BS) design is based on two key ideas. First, PAR-BS processes DRAM requests in batches to provide fairness and to avoid starvation of requests.  ...  As a result both fairness and system throughput degrade, and some threads can starve for long time periods.  ...  drafts of this paper.  ... 
doi:10.1109/isca.2008.7 dblp:conf/isca/MutluM08 fatcat:je6byo2lqrh5vcwg7ytuqobwxe
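The snippet above names the two PAR-BS ideas: batching requests to bound waiting time, and ranking threads within a batch so each thread's bank-level parallelism is preserved. A minimal sketch of both, with illustrative names and an illustrative per-thread cap (the paper's actual controller logic, row-buffer handling, and ranking details are not reproduced here):

```python
# Sketch of the two PAR-BS ideas: (1) group outstanding requests into a
# batch with a per-thread cap, so no request starves behind an unbounded
# stream from another thread; (2) rank threads shortest-job-first within
# the batch, serving a thread's requests to different banks back-to-back
# so they stay overlapped. All names and the cap value are illustrative.

class Request:
    def __init__(self, thread_id, bank, row):
        self.thread_id, self.bank, self.row = thread_id, bank, row

def form_batch(queue, cap=4):
    """Take up to `cap` oldest requests per thread as the current batch."""
    taken = {}
    batch = []
    for req in queue:  # queue is ordered oldest-first
        if taken.get(req.thread_id, 0) < cap:
            taken[req.thread_id] = taken.get(req.thread_id, 0) + 1
            batch.append(req)
    return batch

def rank_threads(batch):
    """Order threads by (max requests to any one bank, total requests):
    'shorter' threads clear their batched requests quickly, improving
    average slowdown without starving anyone (the batch bounds waiting)."""
    total, per_bank = {}, {}
    for req in batch:
        total[req.thread_id] = total.get(req.thread_id, 0) + 1
        key = (req.thread_id, req.bank)
        per_bank[key] = per_bank.get(key, 0) + 1
    def rank_key(tid):
        return (max(v for (t, _), v in per_bank.items() if t == tid), total[tid])
    return sorted(total, key=rank_key)
```

A real controller would additionally prefer row-hit requests of the highest-ranked thread when issuing; the sketch omits row-buffer state entirely.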

Parallelism-Aware Batch Scheduling

Onur Mutlu, Thomas Moscibroda
2008 SIGARCH Computer Architecture News  
Our parallelism-aware batch scheduler (PAR-BS) design is based on two key ideas. First, PAR-BS processes DRAM requests in batches to provide fairness and to avoid starvation of requests.  ...  As a result both fairness and system throughput degrade, and some threads can starve for long time periods.  ...  drafts of this paper.  ... 
doi:10.1145/1394608.1382128 fatcat:yfmte3tzcvdivb4blz2itqfgt4

CADS: Core-Aware Dynamic Scheduler for Multicore Memory Controllers [article]

Eduardo Olmedo Sanchez, Xian-He Sun
2019 arXiv   pre-print
Our scheduler utilizes locality among data requests from multiple cores and exploits parallelism in accessing multiple banks of DRAM.  ...  CADS is also able to share the DRAM while guaranteeing fairness to all cores accessing memory.  ...  Our CADS based scheduling policy is much more sophisticated to include core awareness and adaptively change fairness and bank parallelism much better.  ... 
arXiv:1907.07776v1 fatcat:vvrjpymdxvb7dcrdmzveu75hoe

Fine-grained QoS scheduling for PCM-based main memory systems

Ping Zhou, Yu Du, Youtao Zhang, Jun Yang
2010 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)  
With wide adoption of chip multiprocessors (CMPs) in modern computers, there is an increasing demand for large capacity main memory systems.  ...  When scheduling a mix of applications of different priority levels, it is often important to provide tunable QoS (Quality-of-Service) for the applications with high priority.  ...  ACKNOWLEDGMENT This work is supported in part by NSF under CNS-0720595, CCF-0734339, CAREER awards CNS-0747242 and CCF-0641177, and a gift from Intel Corp.  ... 
doi:10.1109/ipdps.2010.5470451 dblp:conf/ipps/ZhouDZY10 fatcat:vxumxnwkwzcaflykbnirc3clvm

Parallel application memory scheduling

Eiman Ebrahimi, Rustam Miftakhutdinov, Chris Fallin, Chang Joo Lee, José A. Joao, Onur Mutlu, Yale N. Patt
2011 Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-44 '11  
This inter-thread memory system interference can significantly degrade parallel application performance. Better memory request scheduling may mitigate such performance degradation.  ...  A primary use of chip-multiprocessor (CMP) systems is to speed up a single application by exploiting thread-level parallelism.  ...  This design reduces interference between requests coming from different IP blocks of the SoC working on the same application by applying the parallelism-aware batch scheduling mechanism of Mutlu and Moscibroda  ... 
doi:10.1145/2155620.2155663 dblp:conf/micro/EbrahimiMFLJMP11 fatcat:4c5jkiu7jfayhi67ezijzojfym

Topology-aware GPU scheduling for learning workloads in cloud environments

Marcelo Amaral, Jordà Polo, David Carrera, Seetharami Seelam, Malgorzata Steinder
2017 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '17  
This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems.  ...  Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud  ...  Furthermore, both cloud and HPC systems can benefit from a GPU topology-aware schedule.  ... 
doi:10.1145/3126908.3126933 dblp:conf/sc/AmaralPCSS17 fatcat:vu4i6hn7jbbtjou3f64bkbg4xa
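The core of topology-aware placement as described in the snippet above, reduced to its simplest form: prefer sets of GPUs that are close in the interconnect topology. A hedged sketch, where the link-cost matrix, its values, and all function names are invented for illustration (the paper's actual strategy and cost model are more involved):

```python
from itertools import combinations

# Hypothetical link-cost matrix: lower = closer in the topology (e.g. same
# PCIe switch), higher = farther (e.g. across sockets). Values illustrative.
COST = [
    [0, 1, 2, 2],
    [1, 0, 2, 2],
    [2, 2, 0, 1],
    [2, 2, 1, 0],
]

def placement_cost(gpus):
    """Total pairwise link cost of a candidate GPU set."""
    return sum(COST[a][b] for a, b in combinations(gpus, 2))

def place_job(free_gpus, n):
    """Pick the n free GPUs with minimum pairwise link cost, so that
    communication-heavy jobs land on tightly coupled devices."""
    best = min(combinations(sorted(free_gpus), n), key=placement_cost)
    return list(best)
```

Exhaustive search over combinations is only viable for small GPU counts; a production scheduler would use the topology tree (or heuristics) instead.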

Contention-Aware Scheduling on Multicore Systems

Sergey Blagodurov, Sergey Zhuravlev, Alexandra Fedorova
2010 ACM Transactions on Computer Systems  
We also conclude that the highest impact of contention-aware scheduling techniques is not in improving performance of a workload as a whole but in improving quality of service or performance isolation  ...  To show the applicability of our analysis we design a new scheduling algorithm, which we prototype at user level, and demonstrate that it performs within 2% of the optimal.  ...  The solutions proposed offer enhancements to certain parts of the memory hierarchy rather than target the shared resource contention problem in the whole system via scheduling.  ... 
doi:10.1145/1880018.1880019 fatcat:eo3ush725jh2tme2zr7i7jpixq

GPU Virtualization and Scheduling Methods

Cheol-Ho Hong, Ivor Spence, Dimitrios S. Nikolopoulos
2017 ACM Computing Surveys  
Furthermore, we review GPU scheduling methods that address performance and fairness issues between multiple virtual machines sharing GPUs.  ...  In this survey paper, we present an extensive and in-depth survey of GPU virtualization techniques and their scheduling methods.  ...  ACKNOWLEDGMENTS The authors are grateful to the anonymous reviewers for their valuable comments and suggestions.  ... 
doi:10.1145/3068281 fatcat:bng347au6veltazpmyyzv5ijmu

Mapping and scheduling HPC applications for optimizing I/O

Jesus Carretero, Emmanuel Jeannot, Guillaume Pallez, David E. Singh, Nicolas Vidal
2020 Proceedings of the 34th ACM International Conference on Supercomputing  
Some of the experiments presented in this paper were carried out using the PlaFRIM experimental testbed, supported by Inria, CNRS (LABRI and IMB), Université de Bordeaux, Bordeaux INP and Conseil Régional  ...  ACKNOWLEDGEMENTS This work was supported in part by the French National Research Agency (ANR) in the frame of DASH (ANR-17-CE25-0004).  ...  In this case our bandwidth-aware strategy performs better for both metrics. In the future, several directions open up, both for the mapping of applications and the scheduling of I/O.  ... 
doi:10.1145/3392717.3392764 fatcat:qw37li63ffgtzl3zfkhwp5juwe

High Performance Memory Requests Scheduling Technique for Multicore Processors

Walid El-Reedy, Ali A. El-Moursy, Hossam A.H. Fahmy
2012 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems  
Achieving both high-throughput and fairness simultaneously is challenging.  ...  In modern computer systems, long memory latency is one of the main bottlenecks micro-architects are facing for leveraging the system performance especially for memory-intensive applications.  ...  This algorithm is based on two main ideas which are batch scheduling and parallelism-aware scheduling.  ... 
doi:10.1109/hpcc.2012.26 dblp:conf/hpcc/El-ReedyEF12 fatcat:vwq3ussjyvfzhccb3fby7db6im

Prefetch-aware shared resource management for multi-core systems

Eiman Ebrahimi, Chang Joo Lee, Onur Mutlu, Yale N. Patt
2011 SIGARCH Computer Architecture News  
We show that our mechanisms improve the performance of a 4-core system that uses network fair queuing, parallelism-aware batch scheduling, and fairness via source throttling by 11.0%, 10.9%, and 11.3%  ...  Recent proposals have addressed high-performance and fair management of these shared resources; however, none of them take into account prefetch requests.  ...  We gratefully acknowledge the support of the Cockrell Foundation, Intel, AMD, Gigascale Systems Research Center, and CyLab. This research was partially supported by NSF CAREER Award CCF-0953246.  ... 
doi:10.1145/2024723.2000081 fatcat:limm2psnhnarzm2izlniebpmf4

Coherence Stalls or Latency Tolerance: Informed CPU Scheduling for Socket and Core Sharing

Sharanyan Srikanthan, Sandhya Dwarkadas, Kai Shen
2016 USENIX Annual Technical Conference  
and improves performance by up to 61% over the default Linux scheduler for mixed workloads.  ...  The efficiency of modern multiprogrammed multicore machines is heavily impacted by traffic due to data sharing and contention due to competition for shared resources.  ...  We also thank the anonymous USENIX ATC reviewers and our shepherd Andy Tucker for comments that helped improve this paper.  ... 
dblp:conf/usenix/SrikanthanDS16 fatcat:zzatm645wzef5lvzndjshgpkja

Prefetch-aware shared resource management for multi-core systems

Eiman Ebrahimi, Chang Joo Lee, Onur Mutlu, Yale N. Patt
2011 Proceeding of the 38th annual international symposium on Computer architecture - ISCA '11  
We show that our mechanisms improve the performance of a 4-core system that uses network fair queuing, parallelism-aware batch scheduling, and fairness via source throttling by 11.0%, 10.9%, and 11.3%  ...  Recent proposals have addressed high-performance and fair management of these shared resources; however, none of them take into account prefetch requests.  ...  We gratefully acknowledge the support of the Cockrell Foundation, Intel, AMD, Gigascale Systems Research Center, and CyLab. This research was partially supported by NSF CAREER Award CCF-0953246.  ... 
doi:10.1145/2000064.2000081 dblp:conf/isca/EbrahimiLMP11 fatcat:fc5wldr4j5drzoi6tdy3tcjt4a

Improving Preemptive Scheduling with Application-Transparent Checkpointing in Shared Clusters

Jack Li, Calton Pu, Yuan Chen, Vanish Talwar, Dejan Milojicic
2015 Proceedings of the 16th Annual Middleware Conference on - Middleware '15  
In such shared environments, cluster schedulers typically utilize preemption by simply killing jobs in order to achieve resource priority and fairness during peak utilization.  ...  Instead of killing preempted jobs or tasks, our approach uses a system level, application-transparent checkpointing mechanism to save the progress of jobs for resumption at a later time when resources  ...  Government, and Georgia Tech Foundation through the John P.  ... 
doi:10.1145/2814576.2814807 dblp:conf/middleware/LiPCTM15 fatcat:2w4fmqvjvzgknfszyzq4xpiwp4

PScheD Political scheduling on the CRAY T3E [chapter]

Richard N. Lagerstrom, Stephan K. Gipp
1997 Lecture Notes in Computer Science  
The meaning of Political Scheduling is defined; we present a general overview of the Cray T3E hardware and operating system and describe the current implementation of the Political Scheduling feature of  ...  This paper describes the components that help realize the Political Scheduling goals of the CRAY T3E system.  ...  Kernel support consists of setting aside a range of priorities named gang priorities, making the thread scheduler and memory manager in each kernel aware of these priorities and enhancing an existing system  ... 
doi:10.1007/3-540-63574-2_19 fatcat:62pmukmxe5ar7nxc6iisr2iv6q
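The gang-priority mechanism mentioned in the snippet above (a reserved priority band that makes the kernel co-schedule all threads of a parallel job) can be sketched as follows; the band boundaries, dict layout, and function names are all assumptions for illustration, not the T3E implementation:

```python
# Hypothetical sketch of gang priorities: priorities inside a reserved band
# mark threads belonging to a gang, and the scheduler runs the whole gang
# together instead of dispatching threads independently.
GANG_PRIO_MIN, GANG_PRIO_MAX = 100, 199  # illustrative reserved band

def is_gang(prio):
    return GANG_PRIO_MIN <= prio <= GANG_PRIO_MAX

def schedule(runnable):
    """If the highest-priority runnable thread carries a gang priority,
    co-schedule every thread sharing that priority; otherwise run the
    single best thread."""
    best = max(runnable, key=lambda t: t["prio"])
    if is_gang(best["prio"]):
        return [t for t in runnable if t["prio"] == best["prio"]]
    return [best]
```

Running or yielding a gang as a unit is what prevents a parallel job from wasting cycles spinning on threads that happen to be descheduled on other nodes.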