Utility-based acceleration of multithreaded applications on asymmetric CMPs
2013
Proceedings of the 40th Annual International Symposium on Computer Architecture - ISCA '13
This paper proposes Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs (UBA), a cooperative software/hardware mechanism for identifying and accelerating the most likely critical ...
code segments from a set of multithreaded applications running on an ACMP. ...
We gratefully acknowledge the support of the Cockrell Foundation and Intel Corporation. ...
doi:10.1145/2485922.2485936
dblp:conf/isca/JoaoSMP13
fatcat:js6dwddr25ao7h5rr2tx4lfxvq
Utility-based acceleration of multithreaded applications on asymmetric CMPs
2013
SIGARCH Computer Architecture News
This paper proposes Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs (UBA), a cooperative software/hardware mechanism for identifying and accelerating the most likely critical ...
code segments from a set of multithreaded applications running on an ACMP. ...
We gratefully acknowledge the support of the Cockrell Foundation and Intel Corporation. ...
doi:10.1145/2508148.2485936
fatcat:vt6gk2d5ibbyjjzhyk4pyo3s34
2019 Index IEEE Computer Architecture Letters Vol. 18
2020
IEEE computer architecture letters
-June 2019 59-62 J Job shop scheduling Performance and Fairness Improvement on CMPs Considering Bandwidth and Cache Utilization. ...
Jeon, H., +, LCA July-Dec. 2019 153-156 Performance and Fairness Improvement on CMPs Considering Bandwidth and Cache Utilization. ...
doi:10.1109/lca.2020.2964168
fatcat:pv44gn35vrb75jabsid7x62xpm
The Coming Wave of Multithreaded Chip Multiprocessors
2007
International journal of parallel programming
To address these limits, the computer industry has embraced chip multiprocessing (CMP), predominately in the form of multiple high-performance superscalar processors on the same die. ...
on-chip shared secondary cache allows for more fine-grain parallelism to be effectively exploited by the CMP. ...
Fig. 11 shows a comparison of SPECjbb 2005 results between the Niagara-based SunFire T2000 and three IBM systems based on CMPs using more conventional superscalar POWER or x86 cores: the IBM p550, IBM ...
doi:10.1007/s10766-007-0033-6
fatcat:4gzhbtdumvablcjfy62osfb2g4
A Survey on Hardware and Software Support for Thread Level Parallelism
[article]
2016
arXiv
pre-print
Hardware support at execution time is very crucial to the performance of the system, thus different types of hardware support for threads also exist or have been proposed, primarily based on widely used ...
Today's computers are built upon multiple processing cores and run applications consisting of a large number of threads, making runtime thread management a complex process. ...
Adding multithreading support to reconfigurable architecture (mainly based on FPGAs) is complex due to its application based flexibility. ...
arXiv:1603.09274v3
fatcat:75isdvgp5zbhplocook6273sq4
History-aware, resource-based dynamic scheduling for heterogeneous multi-core processors
2011
IET Computers & Digital Techniques
HARD relies on recording application resource utilization and throughput to adaptively change cores for applications during runtime. ...
We compare HARD to a complexity-based static scheduler and show that HARD outperforms this alternative. ...
In [10] Accelerated Critical Sections (ACS) is introduced which leverages the high-performance core(s) of an Asymmetric Chip Multiprocessor (ACMP) to accelerate the execution of critical sections. ...
doi:10.1049/iet-cdt.2009.0045
fatcat:h5yd2rnaone47gcuph6jwxhcmu
Scheduling Algorithms for Asymmetric Multi-core Processors
[article]
2017
arXiv
pre-print
Research advocates asymmetric multi-core processor systems for better utilization of chip real estate. ...
This paper explores some representative algorithms of these classes to get an overview of scheduling algorithms for asymmetric multicore systems. ...
CAMP dynamically selects which utility threshold to use based on system workload. There are two pairs of utility thresholds used. ...
arXiv:1702.04028v1
fatcat:uwvezkeg5bazzaruyglkcilqyi
Bottleneck identification and scheduling in multithreaded applications
2012
Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '12
on an Asymmetric Chip Multi-Processor (ACMP). ...
Performance of multithreaded applications is limited by a variety of bottlenecks, e.g. critical sections, barriers and slow pipeline stages. ...
Acknowledgments We thank Eiman Ebrahimi, Veynu Narasiman, Santhosh Srinath, other members of the HPS research group, our shepherd Ras Bodik and the anonymous reviewers for their comments and suggestions ...
doi:10.1145/2150976.2151001
dblp:conf/asplos/JoaoSMP12
fatcat:iwf4i7vfy5gx7gddspyjjtun3u
Bottleneck identification and scheduling in multithreaded applications
2012
SIGARCH Computer Architecture News
on an Asymmetric Chip Multi-Processor (ACMP). ...
Performance of multithreaded applications is limited by a variety of bottlenecks, e.g. critical sections, barriers and slow pipeline stages. ...
Acknowledgments We thank Eiman Ebrahimi, Veynu Narasiman, Santhosh Srinath, other members of the HPS research group, our shepherd Ras Bodik and the anonymous reviewers for their comments and suggestions ...
doi:10.1145/2189750.2151001
fatcat:gdo2bg5wpvbxho53a5cg6mctte
Power challenges may end the multicore era
2013
Communications of the ACM
The low utility of this "dark silicon" may prevent both scaling to higher core counts and ultimately the economic viability of continued silicon scaling. ...
Under these conditions, more cores are only possible if the cores are slower, simpler, or less utilized with each additional technology generation. ...
Asymmetric multicore. The asymmetric multicore topology consists of one large monolithic core and many identical small cores. ...
doi:10.1145/2408776.2408797
fatcat:s3kiqdlmrjaopmrwifwdsz7tdm
Sharing the instruction cache among lean cores on an asymmetric CMP for HPC applications
2017
2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
Current supercomputing systems with heterogeneous or asymmetric CMPs (ACMP) combine few high-performance big cores for serial regions, together with many low-power lean cores for throughput computing. ...
This paper analyzes the performance, power and area impact of such a design on an ACMP with one high-performance core and multiple low-power cores. ...
We compare two symmetric CMPs, one with four big and one with 16 small cores, to an asymmetric CMP with one big and 12 small cores. ...
doi:10.1109/ispass.2017.7975265
dblp:conf/ispass/MilicRCR17
fatcat:ubmgox4ir5fczaiz4efhg62ctu
Enabling Network Security in HPC Systems Using Heterogeneous CMPs
[chapter]
2014
High-Performance Computing on Complex Environments
However, homogeneous CMPs provide only one type of core to match various application requirements, thereby not fully utilizing the available chip area and power budget. ...
More specifically, we propose an integer linear programming (ILP)-based methodology to mathematically analyze and provide heterogeneous CMP architectures and task High-Performance Computing on Complex ...
ACKNOWLEDGMENTS This work was supported in part by Open European Network for High-Performance Computing on Complex Environments, the TUBITAK Grant 112E360, and a grant from Turk Telekom under Grant 3015 ...
doi:10.1002/9781118711897.ch20
fatcat:apzx5plqi5cfhcfv5bise7fcbu
Study and evaluation of an Irregular Graph Algorithm on Multicore and GPU Processor Architectures
[article]
2016
arXiv
pre-print
One area of computing applications that poses a significant challenge to performance scalability on Chip Multiprocessors (CMPs) is irregular applications. ...
algorithm on relatively lower-cost Multicore and GPGPU based platforms. ...
in a multithreaded application. ...
arXiv:1603.02655v1
fatcat:nklt3op66vdfdpmd3ygeckhwla
Leveraging Core Specialization via OS Scheduling to Improve Performance on Asymmetric Multicore Systems
2012
ACM Transactions on Computer Systems
To demonstrate the effectiveness of CAMP on more realistic scenarios, we augmented the CAMP scheduler with a model that predicts the speedup factor on a real AMP prototype that closely matches future asymmetric ...
To deliver this potential to unmodified applications, the OS scheduler must map threads to cores in consideration of the properties of both. ...
As explained below, multithreaded applications will most often fall into the LOW utility class. ...
doi:10.1145/2166879.2166880
fatcat:vqn6pw2ugjaizaxdelnf7d4fme
Portable performance on asymmetric multicore processors
2016
Proceedings of the 2016 International Symposium on Code Generation and Optimization - CGO 2016
Applying these criteria effectively is challenging especially for complex and non-scalable multithreaded applications. ...
Performance advantages are robust to a complex multithreaded adversary independently scheduled by the OS. WASH effectively identifies and optimizes a wider class of workloads than prior work. ...
We rank threads based on their relative capacity to retire instructions, seeking to accelerate threads that dominate in terms of productive work (line 16). ...
doi:10.1145/2854038.2854047
dblp:conf/cgo/JibajaCBM16
fatcat:dgdx7swpu5eafhze3wufc5euwu