A scheduling policy for preserving cache locality in a multiprogrammed system
2000
Journal of systems architecture
To solve this requirement, we propose a preemption-safe policy to exploit the cache locality of blocked programs in a multiprogrammed system. ...
In a multiprogrammed system, when the operating system switches contexts, in addition to the cost for handling the processes being swapped out and in, the cache performance of processors also can be affected ...
Acknowledgements This work was supported in part by National Research Laboratory Program funded by Ministry of Science and Technology and university S/W research center program by Ministry of Information ...
doi:10.1016/s1383-7621(00)00020-5
fatcat:bgqgndhm45defdhs2ylfos3nxe
Realistic Workload Scheduling Policies for Taming the Memory Bandwidth Bottleneck of SMPs
[chapter]
2004
Lecture Notes in Computer Science
In this paper we reformulate the thread scheduling problem on multiprogrammed SMPs. ...
Therefore, we present and evaluate two realistic scheduling policies which treat memory bandwidth as a first-class resource. ...
Acknowledgements The first author is supported by a grant from 'Alexander S. Onassis' public benefit foundation and the European Commission through IST grant No. 2001-33071. ...
doi:10.1007/978-3-540-30474-6_33
fatcat:elyq5hocivhazeb54xkn4imndi
Processor allocation policies for message-passing parallel computers
1994
Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems - SIGMETRICS '94
When multiple jobs compete for processing resources ...
Acknowledgements The authors thank Martin Tompa for valuable discussions regarding the analysis of the Folding rotation scheme. ...
In contrast, Equipartition shows some tendency towards a local minimum, especially for high loads. ...
doi:10.1145/183018.183022
dblp:conf/sigmetrics/McCannZ94
fatcat:zi3jy5wq3baelpwdzw76n4dszm
Processor allocation policies for message-passing parallel computers
1994
Performance Evaluation Review
When multiple jobs compete for processing resources ...
Acknowledgements The authors thank Martin Tompa for valuable discussions regarding the analysis of the Folding rotation scheme. ...
In contrast, Equipartition shows some tendency towards a local minimum, especially for high loads. ...
doi:10.1145/183019.183022
fatcat:hlhwokoafra23ed4z67wl4hizy
Scheduling algorithms with bus bandwidth considerations for SMPs
2003
2003 International Conference on Parallel Processing, 2003. Proceedings.
The new scheduling policies improve system throughput by up to 68% (26% on average) in comparison with the standard Linux scheduler. ...
However, both software and scheduling policies for these systems generally focus on memory hierarchy optimizations and do not address the bus bandwidth limitations directly. ...
The same philosophy is followed in SMP operating systems for scheduling multiprogrammed workloads with time-sharing. All SMP schedulers use cache affinity links for each thread. ...
doi:10.1109/icpp.2003.1240622
dblp:conf/icpp/AntonopoulosNP03
fatcat:vqmmhe3ztfhzxbx5ymk5fn7q54
Scheduling Algorithms with Bus Bandwidth Considerations for SMPs
[chapter]
2006
High-Performance Computing
The new scheduling policies improve system throughput by up to 68% (26% on average) in comparison with the standard Linux scheduler. ...
However, both software and scheduling policies for these systems generally focus on memory hierarchy optimizations and do not address the bus bandwidth limitations directly. ...
The same philosophy is followed in SMP operating systems for scheduling multiprogrammed workloads with time-sharing. All SMP schedulers use cache affinity links for each thread. ...
doi:10.1002/0471732710.ch16
fatcat:oq2pxgeq2bbmnclpyngg5tw4k4
Adaptive two-level thread management for fast MPI execution on shared memory machines
1999
Proceedings of the 1999 ACM/IEEE conference on Supercomputing (CDROM) - Supercomputing '99
There is also work on OS scheduling to exploit cache affinity [30]. We combine these two ideas together and extend them for the MPI runtime system. ...
SGI machines, thread yielding and resumption is cumbersome and fairly slow (e.g. the thread yield function resumes a kernel thread in a non-deterministic manner and the shortest sleep interval for a nanosleep ...
We would like to thank Bill Gropp, Eric Salo, and anonymous referees for their helpful comments, and Claus Jeppesen for his help in using Origin 2000 at UCSB. ...
doi:10.1145/331532.331581
dblp:conf/sc/ShenTY99
fatcat:x3sbdt6cmnclzgxftkoavp4v3u
A Workload-Adaptive and Reconfigurable Bus Architecture for Multicore Processors
2010
International Journal of Reconfigurable Computing
In this paper, we first motivate the need for workload-adaptive interconnection networks. ...
Interconnection networks for multicore processors are traditionally designed to serve a diversity of workloads. ...
The IO cache snoops the requests in a similar manner as the rest of L2 caches in the system. [Figure 16: The system architecture used for full-system mode simulation.] ...
doi:10.1155/2010/205852
fatcat:xzmi24sg3zgbzok3rrfyrowtha
Cache restoration for highly partitioned virtualized systems
2012
IEEE International Symposium on High-Performance Computer Architecture
While most systems allow for partitioning at the relatively coarse grain of a single core, some systems also support multiprogrammed virtualization, whereby a system can be more finely partitioned through ...
Through cycle-accurate simulation of a POWER7 system, we show that when applied to its private per-core L3 last-level cache, the warm cache translates into 20% on average performance improvement for a ...
For example, in a typical Nehalem EX system, an L3 cache miss requires 79 ns to receive the data from DRAM. ...
doi:10.1109/hpca.2012.6169029
dblp:conf/hpca/DalyC12
fatcat:oj3lparhifffnmfytdnhdbwjbu
The implications of cache affinity on processor scheduling for multiprogrammed, shared memory multiprocessors
1991
Proceedings of the thirteenth ACM symposium on Operating systems principles - SOSP '91
A scheduling policy that ignores this affinity may waste processing power by causing excessive cache refilling. ...
In a shared memory multiprocessor with caches, executing tasks develop "affinity" to processors by filling their caches with data and instructions during execution. ...
Using an analytic model of cache footprint behavior, and an analytic model of a multiprogrammed system and its workload, they concluded that affinity scheduling can have a pronounced effect on performance ...
doi:10.1145/121132.121140
dblp:conf/sosp/VaswaniZ91
fatcat:c2jnnrx5mnat3no5tmuepnm2hy
A Transparent Operating System Infrastructure for Embedding Adaptability to Thread-Based Programming Models
[chapter]
2001
Lecture Notes in Computer Science
Our experiments show that using these services in a multiprogrammed SMP yields a throughput improvement of up to 41.2%. ...
This paper defines a unified set of services, implemented at the operating system level, which can be used to embed adaptability in any thread-based programming paradigm. ...
Local scheduling is also performed if a thread controlled by our scheduler is dequeued from the run-queue of the native scheduler. ...
doi:10.1007/3-540-44681-8_75
fatcat:kc6gucwhzjg6pcb6le7aixdtdm
Flex memory: Exploiting and managing abundant off-chip optical bandwidth
2011
2011 Design, Automation & Test in Europe
To further preserve locality and maintain service parallelism for different workloads, page folding technique is employed to achieve adaptive data mapping in photonics-connected DRAM chips via optical ...
However, current DRAM organization has mainly been optimized for a higher storage capacity and package pin utilization. ...
Both open-page and close-page management policies with first-ready-first-come-first-serve (FR-FCFS) and batching scheduling are evaluated. The simulated system is organized as shown in Table II. ...
doi:10.1109/date.2011.5763157
dblp:conf/date/WangZHLL11
fatcat:v2d3w7fjxrfhndfnmukxsdiag4
Informing algorithms for efficient scheduling of synchronizing threads on multiprogrammed SMPs
2001
International Conference on Parallel Processing, 2001.
The applications are given the opportunity to influence, in a non-intrusive manner, the scheduling decisions concerning their threads. ...
We present novel algorithms for efficient scheduling of synchronizing threads on multiprogrammed SMPs. The algorithms are based on intra-application priority control of synchronizing threads. ...
The problem of scheduling synchronizing threads in a multiprogramming environment has not been adequately addressed, if at all, in contemporary commercial SMP schedulers for small- and medium-scale systems ...
doi:10.1109/icpp.2001.952054
dblp:conf/icpp/AntonopoulosNP01
fatcat:itlttyxpynbrrpariy3642ori4
Optimizing virtual machine scheduling in NUMA multicore systems
2013
2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA)
We propose a Bias Random vCPU Migration (BRM) algorithm that dynamically migrates vCPUs to minimize the system-wide uncore penalty. We have implemented the scheme in the Xen virtual machine monitor. ...
Experiment results on a two-way Intel NUMA multicore system with various workloads show that BRM is able to improve application performance by up to 31.7% compared with the default Xen credit scheduler ...
Acknowledgements We are grateful to the anonymous reviewers for their constructive comments. This research was supported in part by the U.S. ...
doi:10.1109/hpca.2013.6522328
dblp:conf/hpca/RaoWZX13
fatcat:jz2yyayw4jgzbozgdvvj7hsivm
The Locality Principle
[chapter]
2006
Communication Networks and Computer Systems
It remains a rich source of inspirations for contemporary research in architecture, caching, Bayesian inference, forensics, web-based business processes, context-aware software, and network science. ...
Locality is among the oldest systems principles in computer science. It was discovered in 1967 during efforts to make early virtual memory systems work well. ...
A feedback control system can stabilize the multiprogramming level and prevent thrashing. The amount of free space is monitored and fed back to the scheduler. ...
doi:10.1142/9781860948947_0004
fatcat:t2wgrmpozja7lj2fqsmwq5ignm