Rsim: simulating shared-memory multiprocessors with ILP processors

C.J. Hughes, V.S. Pai, P. Ranganathan, S.V. Adve
2002 Computer  
Modeling ILP features in a multiprocessor is particularly important for applications that exhibit parallelism among read misses.  ...  Rsim is a publicly available architecture simulator for shared-memory systems built from processors that aggressively exploit instruction-level parallelism.  ...  speculation in multiprocessor and uniprocessor simulations.  ... 
doi:10.1109/2.982915 fatcat:llmomku5z5cffaf22hadpdvlhy

DMP

Joseph Devietti, Brandon Lucia, Luis Ceze, Mark Oskin
2009 SIGARCH Computer Architecture News  
Current shared memory multicore and multiprocessor systems are nondeterministic.  ...  In this paper we make the case for fully deterministic shared memory multiprocessing (DMP). The behavior of an arbitrary multithreaded program on a DMP system is only a function of its inputs.  ...  Acknowledgments We thank the anonymous reviewers for their invaluable comments.  ... 
doi:10.1145/2528521.1508255 fatcat:i53avj67uvfg5dovopb6p7xjfi

DMP

Joseph Devietti, Brandon Lucia, Luis Ceze, Mark Oskin
2009 Proceeding of the 14th international conference on Architectural support for programming languages and operating systems - ASPLOS '09  
Current shared memory multicore and multiprocessor systems are nondeterministic.  ...  In this paper we make the case for fully deterministic shared memory multiprocessing (DMP). The behavior of an arbitrary multithreaded program on a DMP system is only a function of its inputs.  ...  Acknowledgments We thank the anonymous reviewers for their invaluable comments.  ... 
doi:10.1145/1508244.1508255 dblp:conf/asplos/DeviettiLCO09 fatcat:hm2rmz7qe5cbbepymg22fro3xe

DMP

Joseph Devietti, Brandon Lucia, Luis Ceze, Mark Oskin
2009 SIGPLAN notices  
Current shared memory multicore and multiprocessor systems are nondeterministic.  ...  In this paper we make the case for fully deterministic shared memory multiprocessing (DMP). The behavior of an arbitrary multithreaded program on a DMP system is only a function of its inputs.  ...  Acknowledgments We thank the anonymous reviewers for their invaluable comments.  ... 
doi:10.1145/1508284.1508255 fatcat:nzvvj2kxerc3vlph4typ5qq5hy

The Stanford Hydra CMP

L. Hammond, B.A. Hubbert, M. Siu, M.K. Prabhu, M. Chen, K. Olukotun
2000 IEEE Micro  
The Hydra chip multiprocessor (CMP) integrates four MIPS-based processors and their primary caches on a single chip together with a shared secondary cache.  ...  To simplify parallel programming, the Hydra CMP supports thread-level speculation and memory renaming, a paradigm that allows performance similar to a uniprocessor of comparable die area on integer programs  ...  Basem Nayfeh guided the early development of Hydra's memory system and thread-level speculation hardware.  ... 
doi:10.1109/40.848474 fatcat:hwou4dbdqfhi5clj6o23atuaka

Multiprocessors should support simple memory consistency models

M.D. Hill
1998 Computer  
I thank the following people, who may or may not agree with me, for their constructive comments: Sarita Adve  ...  Acknowledgments The ideas in this article crystallized through interactions with many people at Wisconsin and at Sun Microsystems during my 1995-1996 sabbatical, which Greg Papadopoulos graciously supported  ...  MEMORY CONSISTENCY MODELS The interface for memory in a shared memory multiprocessor is called a memory consistency model.  ... 
doi:10.1109/2.707614 fatcat:rwjgtncdqza4zlvvnhmmt2b3za

Efficient shared memory with minimal hardware support

Leonidas I. Kontothanassis, Michael L. Scott
1995 SIGARCH Computer Architecture News  
Shared memory is widely regarded as a more intuitive model than message passing for the development of parallel programs.  ...  to compare the performance of these protocols to that of full hardware coherence and distributed shared memory emulation.  ...  Figure 1: Normalized running time on CC-NUMA and NCC-NUMA hardware. Figure 2: Normalized running time on DSM and NCC-NUMA hardware. Figure 4: Hardware configuration of the Cashmere prototype parallel  ... 
doi:10.1145/218864.218870 fatcat:ukmwch45y5gkvlmqfnkunzd6zi

Multi-Threaded Processors [chapter]

David Padua, Amol Ghoting, John A. Gunnels, Mark S. Squillante, José Meseguer, James H. Cownie, Duncan Roweth, Sarita V. Adve, Hans J. Boehm, Sally A. McKee, Robert W. Wisniewski, George Karypis (+29 others)
2011 Encyclopedia of Parallel Computing  
The instruction-level parallelism found in a conventional instruction stream is limited. Studies have shown the limits of processor utilization even for today's superscalar microprocessors.  ...  In contrast, the multithreaded processor is able to pursue two or more threads of control in parallel within the processor pipeline.  ...  CHIP MULTIPROCESSORS Principal alternatives Today the most common organizational principles for multiprocessors are the symmetric multiprocessor (SMP), the distributed shared memory multiprocessor (DSM  ... 
doi:10.1007/978-0-387-09766-4_423 fatcat:heb3n2cfwnbi5nvxv5kvxd2xgm

Multithreaded Processors

T. Ungerer
2002 Computer journal  
The instruction-level parallelism found in a conventional instruction stream is limited. Studies have shown the limits of processor utilization even for today's superscalar microprocessors.  ...  In contrast, the multithreaded processor is able to pursue two or more threads of control in parallel within the processor pipeline.  ...  CHIP MULTIPROCESSORS Principal alternatives Today the most common organizational principles for multiprocessors are the symmetric multiprocessor (SMP), the distributed shared memory multiprocessor (DSM  ... 
doi:10.1093/comjnl/45.3.320 fatcat:hlkkabuhrzhkrmuyqomzfmc6zm

Emulating Transactional Memory on FPGA Multiprocessors [chapter]

Matteo Pusceddu, Simone Ceccolini, Antonino Tumeo, Gianluca Palermo, Donatella Sciuto
2011 Lecture Notes in Computer Science  
In this paper we discuss the development of two emulation platforms for transactional memory systems on a single Field Programmable Gate Array (FPGA).  ...  We introduce two systems, integrating only off-the-shelf components, that respectively use a centralized and a distributed approach, presenting their hardware and software design.  ...  Introduction Transactional memory [7] has emerged as a promising programming paradigm for shared memory multiprocessor architectures.  ... 
doi:10.1007/978-3-642-19137-4_7 fatcat:6flipy4ij5hqjf2dwbjfxpdmda

The Jrpm system for dynamically parallelizing Java programs

Michael K. Chen, Kunle Olukotun
2003 Proceedings of the 30th annual international symposium on Computer architecture - ISCA '03  
) are analyzed in real-time to identify the best loops to parallelize.  ...  CMPs have low sharing and communication costs relative to traditional multiprocessors, and thread-level speculation (TLS) simplifies program parallelization by allowing us to parallelize optimistically  ...  Tracer for Extracting Speculative Threads (TEST) [9] is hardware support in Jrpm that analyzes sequential program execution in real-time to find the best regions to parallelize.  ... 
doi:10.1145/859618.859668 fatcat:quf7z5ocfffjdnkxdhjyy7hvya

The Jrpm system for dynamically parallelizing Java programs

Michael K. Chen, Kunle Olukotun
2003 Proceedings of the 30th annual international symposium on Computer architecture - ISCA '03  
) are analyzed in real-time to identify the best loops to parallelize.  ...  CMPs have low sharing and communication costs relative to traditional multiprocessors, and thread-level speculation (TLS) simplifies program parallelization by allowing us to parallelize optimistically  ...  Tracer for Extracting Speculative Threads (TEST) [9] is hardware support in Jrpm that analyzes sequential program execution in real-time to find the best regions to parallelize.  ... 
doi:10.1145/859666.859668 fatcat:imo5rgynjfektfdplhdkh5l4oa

The Jrpm system for dynamically parallelizing Java programs

Michael K. Chen, Kunle Olukotun
2003 SIGARCH Computer Architecture News  
) are analyzed in real-time to identify the best loops to parallelize.  ...  CMPs have low sharing and communication costs relative to traditional multiprocessors, and thread-level speculation (TLS) simplifies program parallelization by allowing us to parallelize optimistically  ...  Tracer for Extracting Speculative Threads (TEST) [9] is hardware support in Jrpm that analyzes sequential program execution in real-time to find the best regions to parallelize.  ... 
doi:10.1145/871656.859668 fatcat:7paj5trthrfifaxkw3kuk2wquy

A Survey on Hardware and Software Support for Thread Level Parallelism [article]

Somnath Mazumdar, Roberto Giorgi
2016 arXiv   pre-print
We also review the programming models with respect to their support to shared-memory, distributed-memory and heterogeneity.  ...  Due to the heterogeneity in hardware, hybrid programming model (which combines the features of shared and distributed model) currently has become very promising.  ...  The aim of Cilk is to help programmers to build applications optimized for a maximum level of parallelism on shared-memory multiprocessors (SMPs).  ... 
arXiv:1603.09274v3 fatcat:75isdvgp5zbhplocook6273sq4

Trends in shared memory multiprocessing

P. Stenstrom, E. Hagersten, D.J. Lilja, M. Martonosi, M. Venugopal
1997 Computer  
The second step is to begin filling gaps in programming models and architectures for shared memory multiprocessing.  ...  Like uniprocessors, current shared memory multiprocessors are often built from high-performance microprocessors, so there is a clear transition path from uniprocessor to multiprocessor program implementations  ...  Acknowledgments We thank Yale Patt, who initiated the set of task forces that allowed us to develop our thoughts in a creative environment in Hawaii.  ... 
doi:10.1109/2.642814 fatcat:mhsgglxwfvdrtc4c4ap6eshxxa