Guest editorial: special issue on mixed-criticality, multi-core, and micro-kernels

Robert I. Davis
2017 Real-time systems  
With the adoption of multicore technology comes the opportunity to combine different applications on the same hardware platform, reducing size, weight and power consumption, as well as assembly and production  ...  By contending for these shared hardware resources, tasks executing on one core can potentially interfere with tasks executing on another core, increasing their worst-case execution times.  ...  platform more predictable, for those criticality levels that need it.  ... 
doi:10.1007/s11241-017-9288-1 fatcat:zcmfabpzj5amjgracewu6wbud4

Guest editorial: Special Issue on Predictable multi-core systems

Robert I. Davis
2020 Real-time systems  
Contention for shared hardware resources thus poses a significant challenge in the development of predictable hard real-time systems running on multi-core platforms.  ...  Three concepts that are useful in a discussion of the real-time behaviour of multicore systems are Timing Composability, Timing Compositionality, and Timing Predictability.  ... 
doi:10.1007/s11241-020-09348-x fatcat:4tiy62twjvfozgehybg3pqatlu

A Flexible Framework for Throttling-Enabled Multicore Management (TEMM)

Xiao Zhang, Rongrong Zhong, Sandhya Dwarkadas, Kai Shen
2012 2012 41st International Conference on Parallel Processing  
Within each iteration, TEMM extrapolates the effects of throttling from reference configurations, searches for a high-quality throttling configuration based on model predictions (accelerated by hill climbing  ...  This paper proposes a flexible framework for Throttling-Enabled Multicore Management (TEMM) that efficiently finds a high-quality hardware execution throttling configuration for a user-specified resource  ...  [26] use an additional layer of translation to control the placement of pages in a multicore shared cache. Mutlu et al.  ... 
doi:10.1109/icpp.2012.8 dblp:conf/icpp/ZhangZDS12 fatcat:qjynudh42vefjk6r6qua7qvxta

An extensible framework for multicore response time analysis

Robert I. Davis, Sebastian Altmeyer, Leandro S. Indrusiak, Claire Maiza, Vincent Nelis, Jan Reineke
2017 Real-time systems  
direct-mapped, shared L2 instruction caches on multicores.  ...  This method is more complex than the one proposed in this paper, and may be more accurate when it estimates the delay due to the shared bus; however, it assumes partitioned caches and therefore does not  ... 
doi:10.1007/s11241-017-9285-4 fatcat:dg6qbbdzfnajxlbsje57ku2ryi

Evaluation and optimization of multicore performance bottlenecks in supercomputing applications

Jeff Diamond, Martin Burtscher, John D. McCalpin, Byoung-Do Kim, Stephen W. Keckler, James C. Browne
This paper first examines traditional unicore metrics and demonstrates how they can be misleading in a multicore system.  ...  The measurement and analysis process is based on a case study of the HOMME atmospheric modeling benchmark code from NCAR running on supercomputers built upon AMD Barcelona and Intel Nehalem quad-core processors  ...  This work was supported in part by the National Science Foundation under award CCF-0916745 and OCI award 0622780.  ... 
doi:10.1109/ispass.2011.5762713 dblp:conf/ispass/DiamondBMKKB11 fatcat:syljgxaqtrcbdkcq2dyoin3tce

Bounding Worst-Case Performance for Multi-Core Processors with Shared L2 Instruction Caches

Jun Yan, Wei Zhang
2011 Journal of Computing Science and Engineering  
As the first step toward real-time multi-core computing, this paper presents a novel approach to bounding the worst-case performance for threads running on multi-core processors with shared L2 instruction  ...  caches.  ...  L2 caches with the same total size, because each core with a shared L2 cache can possibly make use of the aggregate L2 cache space more efficiently.  ... 
doi:10.5626/jcse.2011.5.1.001 fatcat:hcv73i6rajelfjkhqwt2hmbmvy

Who Is Your Neighbor: Net I/O Performance Interference in Virtualized Clouds

Xing Pu, Ling Liu, Yiduo Mei, Sankaran Sivathanu, Younggyun Koh, Calton Pu, Yuanda Cao
2013 IEEE Transactions on Services Computing  
In this paper, we present the experimental research on performance interference in parallel processing of CPU-intensive and network-intensive workloads on Xen virtual machine monitor (VMM).  ...  The more CPUs pinned on Dom0 the worse performance is achieved by CPU-intensive workload. Last, due to fast I/O processing in I/O channel, limitation on grant table is a potential bottleneck in Xen.  ...  multiple alternative hardware platforms, ranging from a dual-core CPU with small L2 cache platform to single-core CPU with large L2 cache to a multicore CPU platform.  ... 
doi:10.1109/tsc.2012.2 fatcat:ewxgqxkbwrftdcfjmupii37bna

High performance MPEG-2 software decoder on the cell broadband engine

David A. Bader, Sulabh Patel
2008 Proceedings, International Parallel and Distributed Processing Symposium (IPDPS)  
We give an experimental study on Sony PlayStation 3 and IBM QS20 dual-Cell Blade platforms.  ...  The Sony-Toshiba-IBM Cell Broadband Engine is a heterogeneous multicore architecture that consists of a traditional microprocessor (PPE) with eight SIMD coprocessing units (SPEs) integrated on-chip.  ...  Acknowledgments This work was supported in part by an IBM Shared University Research (SUR) award and NSF Grants CNS-0614915, CAREER CCF-0611589, and DBI-0420513.  ... 
doi:10.1109/ipdps.2008.4536234 dblp:conf/ipps/BaderP08a fatcat:i5lsol5lpfgb5gpgik4vqzowsi

Vineyard In The Hipeac Newsletter Info 45 [article]

Christoforos Kachris
2016 Zenodo  
VINEYARD will develop an integrated platform for energy-efficient data centres based on new servers with novel, coarse-grain and fine-grain, programmable hardware accelerators.  ...  The article can be found on page 16.  ...  "Effective knowledge sharing and open innovation will enable new exciting My research activity is focused on embedded systems and on virtualized, heterogeneous and manycore platforms and in particular  ... 
doi:10.5281/zenodo.836718 fatcat:tktuvwkcgfhqpiyfzcqwsk5rny

Bank-aware Dynamic Cache Partitioning for Multicore Architectures

Dimitris Kaseridis, Jeffrey Stuecheli, Lizy K. John
2009 2009 International Conference on Parallel Processing  
Results for an 8-core system show that our proposed scheme provides on average a 70% reduction in misses compared to non-partitioned shared caches, and a 25% misses reduction compared to static equally  ...  This enables sharing of computation resources that was not previously possible.  ...  This research was supported in part by NSF Award number 0702694 and IBM.  ... 
doi:10.1109/icpp.2009.55 dblp:conf/icpp/KaseridisSJ09 fatcat:2flwldg4c5fslfqky7rrgdfj6q

GPU Acceleration for Simulating Massively Parallel Many-Core Platforms

Shivani Raghav, Martino Ruggiero, Andrea Marongiu, Christian Pinto, David Atienza, Luca Benini
2015 IEEE Transactions on Parallel and Distributed Systems  
This paper presents a novel methodology to accelerate the simulation of many-core coprocessors using GPU platforms.  ...  Simulation of many target nodes is mapped to the many hardware-threads available on highly parallel GPU platforms.  ...  He received the IEEE CEDA Early Career Award in 2013, the ACM SIGDA Outstanding New Faculty Award in 2012, and a Faculty Award from Sun Labs at Oracle in 2011.  ... 
doi:10.1109/tpds.2014.2319092 fatcat:ngmczikqw5buvgxwd5askw3cpa

Supporting Address Translation for Accelerator-Centric Architectures

Yuchen Hao, Zhenman Fang, Glenn Reinman, Jason Cong
2017 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA)  
This mechanism is based on our insight that the existing MMU cache in the CPU MMU satisfies the demand of customized accelerators with minimal overhead.  ...  Second, to compensate for the effects of the widely used data tiling techniques, we design a shared level-two TLB to serve private TLB misses on common virtual pages, eliminating duplicate page walks from  ...  , Intel, IBM Research Almaden and Mentor Graphics; and C-FAR, one of the six centers of STARnet, a Semiconductor Research Corporation program sponsored by MARCO and DARPA.  ... 
doi:10.1109/hpca.2017.19 dblp:conf/hpca/HaoFRC17 fatcat:cqywaosz3rddnkm4aiehc674cu

Locality Exists in Graph Processing: Workload Characterization on an Ivy Bridge Server

Scott Beamer, Krste Asanovic, David Patterson
2015 2015 IEEE International Symposium on Workload Characterization  
In this work, we analyze the performance characteristics of three high-performance graph algorithm codebases using hardware performance counters on a conventional dual-socket server.  ...  Based on our observations of simultaneous low compute and bandwidth utilization, we find there is substantial room for a different processor architecture to improve performance without requiring a new  ...  Any opinions, findings, conclusions, or recommendations in this paper are solely those of the authors and does not necessarily reflect the position or the policy of the sponsors.  ... 
doi:10.1109/iiswc.2015.12 dblp:conf/iiswc/BeamerAP15 fatcat:s5spuxorsnebdm7lulspnn6hwu

Systematic Approach for State-of-the-Art Architectures and System-on-chip Selection for Heterogeneous IoT Applications

Ramesh Krishnamoorthy, Kalimuthu Krishnan, Bharatiraja Chokkalingam, Sanjeevikumar Padmanaban, Zbigniew Leonowicz, Jens Bo Holm-Nielsen, Massimo Mitolo
2021 IEEE Access  
This paper seeks to comprehend the various IoT device specifications and their characteristics to support multiple applications.  ...  The proposed algorithm identifies the optimized SoC architecture concerning device parameters such as a clock, cache, RAM space, external storage, network support, etc.  ...  Lee Department Prize Paper Award, the IEEE-I&CPS 2015 Department Achievement Award, and the IEEE Region Six Outstanding Engineer Award.  ... 
doi:10.1109/access.2021.3055650 fatcat:n5yo3savcjdyxdolpwrlc5dza4

A case for FAME

Zhangxi Tan, Andrew Waterman, Henry Cook, Sarah Bird, Krste Asanović, David Patterson
2010 Proceedings of the 37th annual international symposium on Computer architecture - ISCA '10  
Given the multicore microprocessor revolution, we argue that the architecture research community needs a dramatic increase in simulation capacity.  ...  The simulation speedup achieved by our adoption of FAME (250×) enables experiments with more realistic time scales and data set sizes than are possible with SAME.  ...  of this paper.  ... 
doi:10.1145/1815961.1815999 dblp:conf/isca/TanWCBAP10 fatcat:4u2ves3qn5ckdhgocyzfplx4z4