Understanding the energy efficiency of simultaneous multithreading

Yingmin Li, David Brooks, Zhigang Hu, Kevin Skadron, Pradip Bose
2004 Proceedings of the 2004 international symposium on Low power electronics and design - ISLPED '04  
Thus, SMT can provide a substantial benefit for energy-efficiency metrics such as ED².  ...  In current microprocessor designs, power-efficiency is of critical importance, and we present modeling extensions to an architectural simulator to allow us to study the power-performance efficiency of  ...  PowerTimer allows us to understand the fundamental tradeoffs between power and performance in single and multithreaded modes of execution.  ... 
doi:10.1145/1013235.1013251 dblp:conf/islped/LiBHSB04 fatcat:5z52c4zc6rgilb6yhm4u5owebm

An Analysis of Microarchitecture Vulnerability to Soft Errors on Simultaneous Multithreaded Architectures

Wangyuan Zhang, Xin Fu, Tao Li, Jose Fortes
2007 2007 IEEE International Symposium on Performance Analysis of Systems & Software  
Simultaneous multithreaded (SMT) architectures exploit thread-level parallelism to improve overall processor throughput.  ...  Using a mixed set of SPEC CPU 2000 benchmarks, we quantify the impact of multithreading on a wide range of microarchitecture structures.  ...  In this paper, we provide an in-depth analysis of the impact of multithreading on processor vulnerability to transient faults.  ... 
doi:10.1109/ispass.2007.363747 dblp:conf/ispass/ZhangFLF07 fatcat:3qns4xdth5fdtl3xowy64gjevq

Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading

Jack L. Lo, Joel S. Emer, Henry M. Levy, Rebecca L. Stamm, Dean M. Tullsen, S. J. Eggers
1997 ACM Transactions on Computer Systems  
Wide-issue superscalar processors exploit ILP by executing multiple instructions from a single program in a single cycle.  ...  With insufficient TLP, processors in an MP will be idle; with insufficient ILP, multiple-issue hardware on a superscalar is wasted.  ...  This article explores parallel processing on a simultaneous multithreading architecture.  ... 
doi:10.1145/263326.263382 fatcat:urempgsyi5fmffbfxkr7s6zcju

Exploring the Capacity of a Modern SMT Architecture to Deliver High Scientific Application Performance [chapter]

Evangelia Athanasaki, Nikos Anastopoulos, Kornilios Kourtis, Nectarios Koziris
2006 Lecture Notes in Computer Science  
In this paper, we explore the performance limits by evaluating the tradeoffs between ILP and TLP for various kinds of instructions streams.  ...  Simultaneous multithreading (SMT) has been proposed to improve system throughput by overlapping instructions from multiple threads on a single wide-issue processor.  ...  Section 4 explores the performance limits and TLP-ILP tradeoffs, by considering a representative set of instruction streams.  ... 
doi:10.1007/11847366_19 fatcat:o64abdsbdbdszbzlmewiranvdu

Performance-reliability tradeoff analysis for multithreaded applications

I. Oz, H. R. Topcuoglu, M. Kandemir, O. Tosun
2012 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE)  
In general, there is a tradeoff between system reliability and performance of multithreaded applications running on multicore architectures.  ...  We measure the performance of these programs by counting execution clock cycles, while the system reliability is measured by Thread Vulnerability Factor (TVF) which is a recently proposed metric.  ...  In a recent work, Thread Vulnerability Factor (TVF) has been proposed as a reliability metric for multithreaded applications on CMP architectures [10].  ... 
doi:10.1109/date.2012.6176624 dblp:conf/date/OzTKT12 fatcat:767m4dqbujd4noo7ufsofxqooa

PTSMT: A Tool for Cross-Level Power, Performance, and Thermal Exploration of SMT Processors

Deepa Kannan, Aseem Gupta, Aviral Shrivastava, Nikil D. Dutt, Fadi J. Kurdahi
2008 21st International Conference on VLSI Design (VLSID 2008)  
While several performance simulation tools to explore the performance aspect of SMT processors early in their design phase exist, there is a lack of early power and performance evaluation tools for SMT  ...  To this end, we have developed PTSMT: a tightly coupled power, performance and thermal exploration tool for SMT processors.  ...  Conclusion There has been extensive research on analyzing the performance characteristics of SMT processors and evaluating their tradeoffs.  ... 
doi:10.1109/vlsi.2008.84 dblp:conf/vlsid/KannanGSDK08 fatcat:2ig3yoe6rzhvnfjqnuj36p3do4

Looking back on the language and hardware revolutions

Hadi Esmaeilzadeh, Ting Cao, Yang Xi, Stephen M. Blackburn, Kathryn S. McKinley
2011 SIGPLAN notices  
We measure representative Intel IA32 processors with technologies ranging from 130nm to 32nm while they execute sequential and parallel benchmarks written in native and managed languages.  ...  (II) Architecture: Clock scaling, microarchitecture, simultaneous multithreading, and chip multiprocessors each elicit a huge variety of power, performance, and energy responses.  ...  Acknowledgements A number of people have generously provided us with assistance. We thank Bob Edwards at ANU for helping fabricate and calibrate the current sensors.  ... 
doi:10.1145/1961296.1950402 fatcat:5nnlthdvfzbzvk4nebm7o2wcty

Looking back on the language and hardware revolutions

Hadi Esmaeilzadeh, Ting Cao, Yang Xi, Stephen M. Blackburn, Kathryn S. McKinley
2011 Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems - ASPLOS '11  
We measure representative Intel IA32 processors with technologies ranging from 130nm to 32nm while they execute sequential and parallel benchmarks written in native and managed languages.  ...  (II) Architecture: Clock scaling, microarchitecture, simultaneous multithreading, and chip multiprocessors each elicit a huge variety of power, performance, and energy responses.  ...  Acknowledgements A number of people have generously provided us with assistance. We thank Bob Edwards at ANU for helping fabricate and calibrate the current sensors.  ... 
doi:10.1145/1950365.1950402 dblp:conf/asplos/EsmaeilzadehCXBM11 fatcat:osbwh2difjh3lkyo5g7e4ibtrm

Looking back on the language and hardware revolutions

Hadi Esmaeilzadeh, Ting Cao, Yang Xi, Stephen M. Blackburn, Kathryn S. McKinley
2012 SIGPLAN notices  
We measure representative Intel IA32 processors with technologies ranging from 130nm to 32nm while they execute sequential and parallel benchmarks written in native and managed languages.  ...  (II) Architecture: Clock scaling, microarchitecture, simultaneous multithreading, and chip multiprocessors each elicit a huge variety of power, performance, and energy responses.  ...  Acknowledgements A number of people have generously provided us with assistance. We thank Bob Edwards at ANU for helping fabricate and calibrate the current sensors.  ... 
doi:10.1145/2248487.1950402 fatcat:wjsurh5gsrdmfh5cgx6xtpysdi

Looking back on the language and hardware revolutions

Hadi Esmaeilzadeh, Ting Cao, Yang Xi, Stephen M. Blackburn, Kathryn S. McKinley
2011 SIGARCH Computer Architecture News  
We measure representative Intel IA32 processors with technologies ranging from 130nm to 32nm while they execute sequential and parallel benchmarks written in native and managed languages.  ...  (II) Architecture: Clock scaling, microarchitecture, simultaneous multithreading, and chip multiprocessors each elicit a huge variety of power, performance, and energy responses.  ...  Acknowledgements A number of people have generously provided us with assistance. We thank Bob Edwards at ANU for helping fabricate and calibrate the current sensors.  ... 
doi:10.1145/1961295.1950402 fatcat:wtvcqdnbkfaqzeldp5gfyd5hte

Area-efficiency in CMP core design

Omid Azizi, Aqeel Mahesri, Sanjay J. Patel, Mark Horowitz
2009 SIGARCH Computer Architecture News  
As a case study, we apply this methodology to explore the performance-area tradeoffs in a highly parallel accelerator architecture for visual computing applications.  ...  In this paper, we examine the area-performance design space of a processing core for a chip multiprocessor (CMP), considering both the architectural design space and the tradeoffs of the physical design  ...  Again, the memory bandwidth, while achievable in a real processor, is overprovisioned so as not to be a bottleneck.  ... 
doi:10.1145/1577129.1577138 fatcat:bpp3llavirhf3oe5the7r6qcji

Optimizing Issue Queue Reliability to Soft Errors on Simultaneous Multithreaded Architectures

Xin Fu, Wangyuan Zhang, Tao Li, José Fortes
2008 2008 37th International Conference on Parallel Processing  
The issue queue (IQ) is a key microarchitecture structure for exploiting instruction-level and thread-level parallelism in dynamically scheduled simultaneous multithreaded (SMT) processors.  ...  In this paper, we explore microarchitecture techniques to optimize IQ reliability to soft error on SMT architectures.  ...  In [25], Brooks et al. explored the tradeoffs between several mechanisms for responding to periods of thermal trauma.  ... 
doi:10.1109/icpp.2008.23 dblp:conf/icpp/FuZLF08 fatcat:zk27gnhemrbcdglrcw2xdfbghu

Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance

Rakesh Kumar, Dean M. Tullsen, Parthasarathy Ranganathan, Norman P. Jouppi, Keith I. Farkas
2004 SIGARCH Computer Architecture News  
This paper demonstrates that this architecture can provide significantly higher performance in the same area than a conventional chip multiprocessor.  ...  It examines policies for heterogeneous architectures both with and without multithreading cores.  ...  This research was funded in part by NSF grant CCR-0105743 and a grant from Intel Corporation.  ... 
doi:10.1145/1028176.1006707 fatcat:rzncce5rfrfevanucuc5uwkulu

Phase guided sampling for efficient parallel application simulation

Jeffrey Namkung, Dohyung Kim, Rajesh Gupta, Igor Kozintsev, Jean-Yves Bouget, Carole Dulong
2006 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis - CODES+ISSS '06  
This cost function provides a convenient control knob for exploiting tradeoffs between simulation speed and accuracy.  ...  Our experimental results show that in most cases, properly setting the cost function's threshold can yield a reduction in sampling by 90%, while maintaining error to less than 5%.  ...  While sampling techniques have been heavily explored for uni-processor/single-threaded benchmarks, only a few recent works have shifted the target platform/application to multi-processor/multithreaded  ... 
doi:10.1145/1176254.1176301 dblp:conf/codes/NamkungKGKBD06 fatcat:ml2tbsj2rrh5thpnnhfchad5lm

McPAT

Sheng Li, Jung Ho Ahn, Richard D. Strong, Jay B. Brockman, Dean M. Tullsen, Norman P. Jouppi
2009 Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture - Micro-42  
Combined with a performance simulator, McPAT enables architects to consistently quantify the cost of new ideas and assess tradeoffs of different architectures using new metrics like energy-delay-area²  ...  At the microarchitectural level, McPAT includes models for the fundamental components of a chip multiprocessor, including in-order and out-of-order processor cores, networks-on-chip, shared caches, integrated  ...  Taken together, this integrated and hierarchical approach enables the user to paint a comprehensive picture of a design space, exploring tradeoffs between design and technology choices in terms of power  ... 
doi:10.1145/1669112.1669172 dblp:conf/micro/LiASBTJ09 fatcat:grtv5brsxzgwxdiqjcdhkfkqwa