Utility-based acceleration of multithreaded applications on asymmetric CMPs

José A. Joao, M. Aater Suleman, Onur Mutlu, Yale N. Patt
2013 Proceedings of the 40th Annual International Symposium on Computer Architecture - ISCA '13  
This paper proposes Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs (UBA), a cooperative software/hardware mechanism for identifying and accelerating the most likely critical  ...  code segments from a set of multithreaded applications running on an ACMP.  ...  We gratefully acknowledge the support of the Cockrell Foundation and Intel Corporation.  ... 
doi:10.1145/2485922.2485936 dblp:conf/isca/JoaoSMP13 fatcat:js6dwddr25ao7h5rr2tx4lfxvq

Utility-based acceleration of multithreaded applications on asymmetric CMPs

José A. Joao, M. Aater Suleman, Onur Mutlu, Yale N. Patt
2013 SIGARCH Computer Architecture News  
This paper proposes Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs (UBA), a cooperative software/hardware mechanism for identifying and accelerating the most likely critical  ...  code segments from a set of multithreaded applications running on an ACMP.  ...  We gratefully acknowledge the support of the Cockrell Foundation and Intel Corporation.  ... 
doi:10.1145/2508148.2485936 fatcat:vt6gk2d5ibbyjjzhyk4pyo3s34

2019 Index IEEE Computer Architecture Letters Vol. 18

2020 IEEE computer architecture letters  
-June 2019 59-62 J Job shop scheduling Performance and Fairness Improvement on CMPs Considering Bandwidth and Cache Utilization.  ...  Jeon, H., +, LCA July-Dec. 2019 153-156 Performance and Fairness Improvement on CMPs Considering Bandwidth and Cache Utilization.  ... 
doi:10.1109/lca.2020.2964168 fatcat:pv44gn35vrb75jabsid7x62xpm

The Coming Wave of Multithreaded Chip Multiprocessors

James Laudon, Lawrence Spracklen
2007 International journal of parallel programming  
To address these limits, the computer industry has embraced chip multiprocessing (CMP), predominately in the form of multiple high-performance superscalar processors on the same die.  ...  on-chip shared secondary cache allows for more fine-grain parallelism to be effectively exploited by the CMP.  ...  Fig. 11 shows a comparison of SPECjbb 2005 results between the Niagara-based SunFire T2000 and three IBM systems based on CMPs using more conventional superscalar POWER or x86 cores: the IBM p550, IBM  ... 
doi:10.1007/s10766-007-0033-6 fatcat:4gzhbtdumvablcjfy62osfb2g4

A Survey on Hardware and Software Support for Thread Level Parallelism [article]

Somnath Mazumdar, Roberto Giorgi
2016 arXiv   pre-print
Hardware support at execution time is very crucial to the performance of the system, thus different types of hardware support for threads also exist or have been proposed, primarily based on widely used  ...  Today's computers are built upon multiple processing cores and run applications consisting of a large number of threads, making runtime thread management a complex process.  ...  Adding multithreading support to reconfigurable architecture (mainly based on FPGAs) is complex due to its application based flexibility.  ... 
arXiv:1603.09274v3 fatcat:75isdvgp5zbhplocook6273sq4

History-aware, resource-based dynamic scheduling for heterogeneous multi-core processors

A.Z. Jooya, A. Baniasadi, M. Analoui
2011 IET Computers & Digital Techniques  
HARD relies on recording application resource utilization and throughput to adaptively change cores for applications during runtime.  ...  We compare HARD to a complexity-based static scheduler and show that HARD outperforms this alternative.  ...  In [10] Accelerated Critical Sections (ACS) is introduced which leverages the high-performance core(s) of an Asymmetric Chip Multiprocessor (ACMP) to accelerate the execution of critical sections.  ... 
doi:10.1049/iet-cdt.2009.0045 fatcat:h5yd2rnaone47gcuph6jwxhcmu

Scheduling Algorithms for Asymmetric Multi-core Processors [article]

Alan David
2017 arXiv   pre-print
Research advocates asymmetric multi-core processor system for better utilization of chip real estate.  ...  This paper explores some representative algorithms of these classes to get an overview of scheduling algorithms for asymmetric multicore system.  ...  CAMP dynamically selects which utility threshold to use based on system workload. There are two pairs of utility thresholds used.  ... 
arXiv:1702.04028v1 fatcat:uwvezkeg5bazzaruyglkcilqyi

Bottleneck identification and scheduling in multithreaded applications

José A. Joao, M. Aater Suleman, Onur Mutlu, Yale N. Patt
2012 Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '12  
on an Asymmetric Chip Multi-Processor (ACMP).  ...  Performance of multithreaded applications is limited by a variety of bottlenecks, e.g. critical sections, barriers and slow pipeline stages.  ...  Acknowledgments We thank Eiman Ebrahimi, Veynu Narasiman, Santhosh Srinath, other members of the HPS research group, our shepherd Ras Bodik and the anonymous reviewers for their comments and suggestions  ... 
doi:10.1145/2150976.2151001 dblp:conf/asplos/JoaoSMP12 fatcat:iwf4i7vfy5gx7gddspyjjtun3u

Bottleneck identification and scheduling in multithreaded applications

José A. Joao, M. Aater Suleman, Onur Mutlu, Yale N. Patt
2012 SIGARCH Computer Architecture News  
on an Asymmetric Chip Multi-Processor (ACMP).  ...  Performance of multithreaded applications is limited by a variety of bottlenecks, e.g. critical sections, barriers and slow pipeline stages.  ...  Acknowledgments We thank Eiman Ebrahimi, Veynu Narasiman, Santhosh Srinath, other members of the HPS research group, our shepherd Ras Bodik and the anonymous reviewers for their comments and suggestions  ... 
doi:10.1145/2189750.2151001 fatcat:gdo2bg5wpvbxho53a5cg6mctte

Power challenges may end the multicore era

Hadi Esmaeilzadeh, Emily Blem, Renée St. Amant, Karthikeyan Sankaralingam, Doug Burger
2013 Communications of the ACM  
The low utility of this "dark silicon" may prevent both scaling to higher core counts and ultimately the economic viability of continued silicon scaling.  ...  Under these conditions, more cores are only possible if the cores are slower, simpler, or less utilized with each additional technology generation.  ...  Asymmetric multicore. The asymmetric multicore topology consists of one large monolithic core and many identical small cores.  ... 
doi:10.1145/2408776.2408797 fatcat:s3kiqdlmrjaopmrwifwdsz7tdm

Sharing the instruction cache among lean cores on an asymmetric CMP for HPC applications

Ugljesa Milic, Alejandro Rico, Paul Carpenter, Alex Ramirez
2017 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)  
Current supercomputing systems with heterogeneous or asymmetric CMPs (ACMP) combine few high-performance big cores for serial regions, together with many low-power lean cores for throughput computing.  ...  This paper analyzes the performance, power and area impact of such a design on an ACMP with one high-performance core and multiple low-power cores.  ...  We compare two symmetric CMPs, one with four big and one with 16 small cores, to an asymmetric CMP with one big and 12 small cores.  ... 
doi:10.1109/ispass.2017.7975265 dblp:conf/ispass/MilicRCR17 fatcat:ubmgox4ir5fczaiz4efhg62ctu

Enabling Network Security in HPC Systems Using Heterogeneous CMPs [chapter]

Ozcan Ozturk, Suleyman Tosun
2014 High-Performance Computing on Complex Environments  
However, homogeneous CMPs provide only one type of core to match various application requirements, thereby not fully utilizing the available chip area and power budget.  ...  More specifically, we propose an integer linear programming (ILP)-based methodology to mathematically analyze and provide heterogeneous CMP architectures and task High-Performance Computing on Complex  ...  ACKNOWLEDGMENTS This work was supported in part by Open European Network for High-Performance Computing on Complex Environments, the TUBITAK Grant 112E360, and a grant from Turk Telekom under Grant 3015  ... 
doi:10.1002/9781118711897.ch20 fatcat:apzx5plqi5cfhcfv5bise7fcbu

Study and evaluation of an Irregular Graph Algorithm on Multicore and GPU Processor Architectures [article]

Varun Nagpal
2016 arXiv   pre-print
One area of Computing applications which poses significant challenge of performance scalability on Chip Multiprocessors (CMPs) are Irregular applications.  ...  algorithm on relatively lower-cost Multicore and GPGPU based platforms.  ...  in a multithreaded application.  ... 
arXiv:1603.02655v1 fatcat:nklt3op66vdfdpmd3ygeckhwla

Leveraging Core Specialization via OS Scheduling to Improve Performance on Asymmetric Multicore Systems

Juan Carlos Saez, Alexandra Fedorova, David Koufaty, Manuel Prieto
2012 ACM Transactions on Computer Systems  
To demonstrate the effectiveness of CAMP on more realistic scenarios, we augmented the CAMP scheduler with a model that predicts the speedup factor on a real AMP prototype that closely matches future asymmetric  ...  To deliver this potential to unmodified applications, the OS scheduler must map threads to cores in consideration of the properties of both.  ...  As explained below, multithreaded applications will most often fall into the LOW utility class.  ... 
doi:10.1145/2166879.2166880 fatcat:vqn6pw2ugjaizaxdelnf7d4fme

Portable performance on asymmetric multicore processors

Ivan Jibaja, Ting Cao, Stephen M. Blackburn, Kathryn S. McKinley
2016 Proceedings of the 2016 International Symposium on Code Generation and Optimization - CGO 2016  
Applying these criteria effectively is challenging especially for complex and non-scalable multithreaded applications.  ...  Performance advantages are robust to a complex multithreaded adversary independently scheduled by the OS. WASH effectively identifies and optimizes a wider class of workloads than prior work.  ...  We rank threads based on their relative capacity to retire instructions, seeking to accelerate threads that dominate in terms of productive work (line 16).  ... 
doi:10.1145/2854038.2854047 dblp:conf/cgo/JibajaCBM16 fatcat:dgdx7swpu5eafhze3wufc5euwu