Fine-grain priority scheduling on multi-channel memory systems

Zhichun Zhu, Zhao Zhang, Xiaodong Zhang
Proceedings Eighth International Symposium on High Performance Computer Architecture  
In this study we show that, by utilizing fine-grain priority access scheduling, we are able to find a workload-independent configuration that achieves optimal performance on a multi-channel memory system  ...  priority scheduling is about 13% and 8% for a 2-channel and a 4-channel Direct Rambus DRAM memory systems, respectively, compared with gang scheduling.  ... 
doi:10.1109/hpca.2002.995702 dblp:conf/hpca/ZhuZZ02 fatcat:z7vsmsjbcjcinau5fjuqsuzg5a

Adaptive granularity memory systems

Doe Hyun Yoon, Min Kyu Jeong, Mattan Erez
2011 Proceeding of the 38th annual international symposium on Computer architecture - ISCA '11  
We use sector caches and sub-ranked memory systems to implement adaptive granularity. We also show how to incorporate adaptive granularity into memory access scheduling.  ...  We propose adaptive granularity to combine the best of fine-grained and coarse-grained memory accesses.  ...  This prevents fine-grained requests with normal priority from being scheduled. As a result, the coarse-grained request finishes at time 9, after which fine-grained requests are serviced.  ... 
doi:10.1145/2000064.2000100 dblp:conf/isca/YoonJE11 fatcat:ty3xafhgl5dxbj3kz5v2uor3da

Adaptive granularity memory systems

Doe Hyun Yoon, Min Kyu Jeong, Mattan Erez
2011 SIGARCH Computer Architecture News  
We use sector caches and sub-ranked memory systems to implement adaptive granularity. We also show how to incorporate adaptive granularity into memory access scheduling.  ...  We propose adaptive granularity to combine the best of fine-grained and coarse-grained memory accesses.  ...  This prevents fine-grained requests with normal priority from being scheduled. As a result, the coarse-grained request finishes at time 9, after which fine-grained requests are serviced.  ... 
doi:10.1145/2024723.2000100 fatcat:kwb7uoqcdnh4bhwaznn46udryq

MIMS: Towards a Message Interface Based Memory System

Li-Cheng Chen, Ming-Yu Chen, Yuan Ruan, Yong-Bing Huang, Ze-Han Cui, Tian-Yue Lu, Yun-Gang Bao
2014 Journal of Computer Science and Technology  
The memory system is made more intelligent and active by equipping it with a local buffer scheduler, which is responsible for processing packets, scheduling memory requests, and executing specific commands with the help  ...  A novel message interface based memory system (MIMS) is proposed. The key innovation of MIMS is that processor and memory system communicate through a universal and flexible message interface.  ...  Yoon et al. proposed adaptive granularity memory systems (AGMS) [4] to combine the best of fine-grained and coarse-grained memory accesses.  ... 
doi:10.1007/s11390-014-1428-7 fatcat:ywlupx7r2vhsnngdh3uozw4omy

Compiler Management of Communication and Parallelism for Quantum Computation

Jeff Heckey, Shruti Patil, Ali JavadiAbhari, Adam Holmes, Daniel Kudrow, Kenneth R. Brown, Diana Franklin, Frederic T. Chong, Margaret Martonosi
2015 Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '15  
The Multi-SIMD model consists of a small number of SIMD regions, each of which may support operations on up to thousands of qubits per cycle.  ...  Efficient Multi-SIMD operation requires efficient scheduling.  ...  The invoked modules have been previously scheduled by one of the fine-grained schedulers discussed next and are now treated as blackbox functions.  ... 
doi:10.1145/2694344.2694357 dblp:conf/asplos/HeckeyPJHKBFCM15 fatcat:h5w7skxobrgfvkatjafy7abfdq

Compiler Management of Communication and Parallelism for Quantum Computation

Jeff Heckey, Shruti Patil, Ali JavadiAbhari, Adam Holmes, Daniel Kudrow, Kenneth R. Brown, Diana Franklin, Frederic T. Chong, Margaret Martonosi
2015 SIGARCH Computer Architecture News  
The Multi-SIMD model consists of a small number of SIMD regions, each of which may support operations on up to thousands of qubits per cycle.  ...  Efficient Multi-SIMD operation requires efficient scheduling.  ...  The invoked modules have been previously scheduled by one of the fine-grained schedulers discussed next and are now treated as blackbox functions.  ... 
doi:10.1145/2786763.2694357 fatcat:k2ggawoqzfh3derltem67omu34

Compiler Management of Communication and Parallelism for Quantum Computation

Jeff Heckey, Shruti Patil, Ali JavadiAbhari, Adam Holmes, Daniel Kudrow, Kenneth R. Brown, Diana Franklin, Frederic T. Chong, Margaret Martonosi
2015 SIGPLAN notices  
The Multi-SIMD model consists of a small number of SIMD regions, each of which may support operations on up to thousands of qubits per cycle.  ...  Efficient Multi-SIMD operation requires efficient scheduling.  ...  The invoked modules have been previously scheduled by one of the fine-grained schedulers discussed next and are now treated as blackbox functions.  ... 
doi:10.1145/2775054.2694357 fatcat:sb66vo3gf5gcfdcrlbwc6ult5a

Real-Time Evaluation of nMPRA CPU Architecture Based on Multithreaded Execution

Ionel Zagan
2015 International Journal of Computer and Electrical Engineering  
We describe the real-time scheduling tests on the nMPRA processor architecture, including a fine-grained multithreading configuration.  ...  This paper conducts a thorough study of the schedulability and predictability questions for a custom-designed CPU architecture, named Multi Pipeline Register Architecture (nMPRA).  ...  We present the experimental tests executed on a five-stage pipeline architecture named nMPRA-MT (Multi Pipeline Register Architecture-Fine-grained Multithreading) [6].  ... 
doi:10.17706/ijcee.2015.7.6.424-429 fatcat:6uyuvt5d4ne2re3frfhg464txq

Improving the Performance of CPU Architectures by Reducing the Operating System Overhead (Extended Version)

Ionel Zagan, Vasile Gheorghita Gaitan
2016 Electrical, Control and Communication Engineering  
This paper focuses on the innovative CPU implementation named nMPRA-MT, designed for small real-time applications.  ...  New requirements also arise for a real-time operating system used in mixed-criticality systems, when the executions of hard real-time applications require timing predictability.  ...  Acknowledgment: This work was partially supported from the project "Integrated Center for research, development and innovation in Advanced Materials, Nanotechnologies, and Distributed Systems for fabrication  ... 
doi:10.1515/ecce-2016-0002 fatcat:5vgjib43orcgzeqzpucuncjt4i

Mixed-Critical Systems Design with Coarse-Grained Multi-core Interference [chapter]

Peter Poplavko, Rany Kahil, Dario Socci, Saddek Bensalem, Marius Bozga
2016 Lecture Notes in Computer Science  
of system functions priority over low-critical ones in emergency situations.  ...  Another challenge comes from the fact that modern platforms, multi- and many-cores, make the scheduling problem more complicated because of their inherent parallelism and because of "parasitic" interference  ...  Interference can be coarse-grain or fine-grain.  ... 
doi:10.1007/978-3-319-47166-2_42 fatcat:pzkiklqvtrcrbcasnbfslmcavm

Fairness issues in software virtual routers

Norbert Egi, Adam Greenhalgh, Mark Handley, Mickael Hoerdt, Felipe Huici, Laurent Mathy
2008 Proceedings of the ACM workshop on Programmable routers for extensible services of tomorrow - PRESTO '08  
Using commodity x86 hardware we show that it is viable to run highly experimental and untrusted router systems alongside a production router on the same hardware platform without sacrificing performance  ...  [Figure: per-flow forwarding rate on single-queue and (emulated) multi-queue NICs with forwarding path priorities (1:2); series: multi-queue NIC, low (33%) and high (67%) priority]  ...  These systems have a uniform memory architecture, with 8GB of 667MHz DDR2 memory connected to the north bridge via four 5.3 GB/s channels.  ... 
doi:10.1145/1397718.1397726 dblp:conf/sigcomm/EgiGHHHM08 fatcat:yh77n5iy3fe6hcxmlwtzdnpgxy

Large exploration for HW/SW partitioning of multirate and aperiodic real-time systems

Abdenour Azzedine, Jean-Philippe Diguet, Jean-Luc Philippe
2002 Proceedings of the tenth international symposium on Hardware/software codesign - CODES '02  
This paper addresses the domain of fine- and coarse-grain HW/SW codesign for Real-Time System On-Chip.  ...  We propose a new method for the real-time scheduling and the HW/SW partitioning of multi-rate or aperiodic tasks. The large design space exploration is based on parallelism/delay trade-off curves.  ...  The method is based on the fixed priority scheduling theory [14, 16]; however, the design space exploration is limited to a fine grain solution (processor + coprocessor) or separately to a coarse grain  ... 
doi:10.1145/774801.774807 fatcat:pwlglrpf7zew5mm7ifokiddoqu

Large exploration for HW/SW partitioning of multirate and aperiodic real-time systems

Abdenour Azzedine, Jean-Philippe Diguet, Jean-Luc Philippe
2002 Proceedings of the tenth international symposium on Hardware/software codesign - CODES '02  
This paper addresses the domain of fine- and coarse-grain HW/SW codesign for Real-Time System On-Chip.  ...  We propose a new method for the real-time scheduling and the HW/SW partitioning of multi-rate or aperiodic tasks. The large design space exploration is based on parallelism/delay trade-off curves.  ...  The method is based on the fixed priority scheduling theory [14, 16]; however, the design space exploration is limited to a fine grain solution (processor + coprocessor) or separately to a coarse grain  ... 
doi:10.1145/774789.774807 dblp:conf/codes/AzzedineDP02 fatcat:5adp5xtkpncetcljxtsk4sfcxu

TurboDL: Improving CNN Training on GPU with Fine-grained Multi-streaming Scheduling

Hai Jin, Wenchao Wu, Xuanhua Shi, Ligang He, Bing B Zhou
2020 IEEE transactions on computers  
Unlike previous research which mainly focuses on single layer or coarse-grained optimization, we introduce a critical-path based, asynchronous parallelization mechanism, and propose the optimization technique  ...  It is challenging to orchestrate concurrency for CNN (convolutional neural networks) training on GPUs since it may introduce synchronization overhead and poor resource utilization.  ...  Data Dependency Analysis and Fine-Grained Task Partitioning: Multi-stream concurrency is an efficient way to utilize the multi-core characteristics of GPUs.  ... 
doi:10.1109/tc.2020.2990321 fatcat:vf24gwreavfbtfllq7zjcic2om

Memory Access Scheduling Schemes for Systems with Multi-Core Processors

Hongzhong Zheng, Jiang Lin, Zhao Zhang, Zhichun Zhu
2008 37th International Conference on Parallel Processing  
On systems with multi-core processors, the memory access scheduling scheme plays an important role not only in utilizing the limited memory bandwidth but also in balancing the program execution on all  ...  We have also thoroughly evaluated a set of memory scheduling schemes that differentiate and prioritize requests from different cores.  ...  Fine-grain scheduling schemes [3, 20] can effectively improve the performance of multi-channel memory systems.  ... 
doi:10.1109/icpp.2008.53 dblp:conf/icpp/ZhengLZZ08 fatcat:m4ohycfxbfcm5mgr6hkxilv6pi