Exploiting Postdominance for Speculative Parallelization

Mayank Agarwal, Kshitiz Malik, Kevin M. Woley, Sam S. Stone, Matthew I. Frank
2007 2007 IEEE 13th International Symposium on High Performance Computer Architecture  
The specific contributions of this paper include, first, a description of task selection based on immediate postdominance for a system that speculatively creates tasks.  ...  Task-selection policies are critical to the performance of any architecture that uses speculation to extract parallel tasks from a sequential thread.  ...  Acknowledgments We are grateful to several people for helping to make this paper possible. Sanjay Patel was an enthusiastic contributor to the early formulation of this work.  ... 
doi:10.1109/hpca.2007.346207 dblp:conf/hpca/AgarwalMWSF07 fatcat:hthp2iczz5dhrpcyiw2syrzxza

A general compiler framework for speculative multithreaded processors

A. Bhowmik, M. Franklin
2004 IEEE Transactions on Parallel and Distributed Systems  
Speculative multithreading (SpMT) promises to be an effective mechanism for parallelizing nonnumeric programs, which tend to have irregular and pointer-intensive data structures and complex flows of control  ...  This paper presents a compiler framework for partitioning a sequential program into multiple threads for parallel execution in an SpMT system.  ...  As we will see later, speculative threads are a must for exploiting thread-level parallelism (TLP) from many nonnumeric programs.  ... 
doi:10.1109/tpds.2004.26 fatcat:hip6liwcxzgxjcxs6v56pp7sbq

Using speculative computation and parallelizing techniques to improve scheduling of control based designs

Roberto Cordone, Fabrizio Ferrandi, Marco D. Santambrogio, Gianluca Palermo, Donatella Sciuto
2006 Proceedings of the 2006 conference on Asia South Pacific design automation - ASP-DAC '06  
Recent research results have seen the application of parallelizing techniques to high-level synthesis.  ...  In particular, the effect of speculative code transformations on mixed control-data flow designs has demonstrated effective results on schedule lengths.  ...  HTGs have been defined as intermediate parallel program representations that encapsulate minimal data and control dependences, and can be used to extract and exploit functional and task-level parallelism  ... 
doi:10.1145/1118299.1118502 fatcat:g62qversxvbkjkwbn26asuao4e

Limits of Thread-Level Parallelism in Non-numerical Programs

Akio Nakajima, Ryotaro Kobayashi, Hideki Ando, Toshio Shimada
2006 IPSJ Digital Courier  
Chip multiprocessors (CMPs), which recently became available with the advance of LSI technology, can outperform current superscalar processors by exploiting thread-level parallelism (TLP).  ...  We focus particularly on three techniques: thread partitioning with various control structure levels, speculative thread execution, and speculative register communication.  ...  A CMP exploits thread-level parallelism (TLP) in addition to instruction-level parallelism (ILP) by executing multiple threads in parallel.  ... 
doi:10.2197/ipsjdc.2.280 fatcat:52xvplsbvfah3a4lnmyu7tzezi

APT: a data structure for optimal control dependence computation

Keshav Pingali, Gianfranco Bilardi
1995 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation - PLDI '95  
This relation is usually represented using the control dependence graph; unfortunately, the size of this data structure can be quadratic in the size of the program, even for some structured programs.  ...  In this paper, we introduce a data structure called the augmented postdominator tree (APT) which is constructed in space and time proportional to the size of the program, and which can answer control  ...  These sets are used in scheduling instructions across basic block boundaries for speculative or predicated execution [Fis81, BR91].  ... 
doi:10.1145/207110.207114 dblp:conf/pldi/PingaliB95 fatcat:j6qtuv4u3fchlhmlof25y72wge

APT: a data structure for optimal control dependence computation

Keshav Pingali, Gianfranco Bilardi
1995 SIGPLAN notices  
This relation is usually represented using the control dependence graph; unfortunately, the size of this data structure can be quadratic in the size of the program, even for some structured programs.  ...  In this paper, we introduce a data structure called the augmented postdominator tree (APT) which is constructed in space and time proportional to the size of the program, and which can answer control  ...  These sets are used in scheduling instructions across basic block boundaries for speculative or predicated execution [Fis81, BR91].  ... 
doi:10.1145/223428.207114 fatcat:qvmrztzokvgstd5x7cutaj4bum

Optimal control dependence computation and the Roman chariots problem

Keshav Pingali, Gianfranco Bilardi
1997 ACM Transactions on Programming Languages and Systems  
The usual representation of this relation is the control dependence graph (CDG), but the size of the CDG can grow quadratically with the input program, even for structured programs.  ...  In this article, we introduce the augmented postdominator tree (APT), a data structure which can be constructed in space and time proportional to the size of the program and which supports enumeration  ...  We thank the referees for their constructive comments.  ... 
doi:10.1145/256167.256217 fatcat:2jiowuxkeffktfudc2pgkugkqi

Recovery code generation for general speculative optimizations

Jin Lin, Wei-Chung Hsu, Pen-Chung Yew, Roy Dz-Ching Ju, Tin-Fook Ngai
2006 ACM Transactions on Architecture and Code Optimization (TACO)  
It also allows multilevel speculation for multilevel pointers and multilevel expression trees to be handled with no additional complexity.  ...  This paper proposes a framework that uses an if-block structure to facilitate check instructions and recovery code generation for general speculative optimizations.  ...  ACKNOWLEDGMENTS The authors wish to thank Sun Chan (Intel), Peng Tu (Intel), Raymond Lo for their valuable suggestions and comments. This work was supported in part by the U.S.  ... 
doi:10.1145/1132462.1132466 fatcat:22fy2p7drjdyjfr7kspoueia44

Dynamic parallelization and mapping of binary executables on hierarchical platforms

Efe Yardimci, Michael Franz
2006 Proceedings of the 3rd conference on Computing frontiers - CF '06  
Leveraging observations made at runtime, a thin software layer recompiles executing code compiled for a uniprocessor and generates parallelized and/or vectorized code segments that exploit available parallel  ...  Among the techniques employed are control speculation, loop distribution across several threads, and automatic parallelization of recursive routines.  ...  Special Case: Recursive Loops In order to be able to exploit parallelism found in recursive loop iterations, we create a special type of superblock just for recursive procedures.  ... 
doi:10.1145/1128022.1128040 dblp:conf/cf/YardimciF06 fatcat:l4p6ckzpefhjvae7bx6boeiwje

P-slice based efficient speculative multithreading

Rakesh Ranjan, Pedro Marcuello, Fernando Latorre, Antonio Gonzalez
2009 2009 International Conference on High Performance Computing (HiPC)  
This technique does not introduce additional traffic in the bus and improves the performance of a conventional SpMT memory model by 6% on average and up to 21% for some applications.  ...  Speculative multithreading (SpMT) has been proposed in the past to boost performance of irregular applications in multi-core environments.  ...  Therefore, processors that are able to exploit speculative thread level parallelism include support for storing the speculative state until validation.  ... 
doi:10.1109/hipc.2009.5433216 dblp:conf/hipc/RanjanMLG09 fatcat:2wshaep3mrbxllprug4s3dhlfu

Qinling: A Parametric Model in Speculative Multithreading

Yuxiang Li, Yinliang Zhao, Bin Liu
2017 Symmetry  
Speculative multithreading (SpMT) is a thread-level automatic parallelization technique that can accelerate sequential programs, especially for irregular applications that are hard to be parallelized by  ...  Experiments show that Qinling delivers a good performance to predict speedups of unseen programs, and provides feedback guidance for Prophet to obtain the optimal partition parameters.  ...  Acknowledgments: We thank our colleagues for their collaboration and the present work. We also thank all the reviewers for their specific comments and suggestions.  ... 
doi:10.3390/sym9090180 fatcat:hz5343sctzcydcmkmtsmg2vmru

Toward a more accurate understanding of the limits of the TLS execution paradigm

Nikolas Ioannou, Jeremy Singer, Salman Khan, Polychronis Xekalakis, Paraskevas Yiapanis, Adam Pocock, Gavin Brown, Mikel Lujan, Ian Watson, Marcelo Cintra
2010 IEEE International Symposium on Workload Characterization (IISWC'10)  
Thread-Level Speculation (TLS) facilitates the extraction of parallel threads from sequential applications.  ...  out-of-order task spawn or support for intermediate checkpointing.  ...  Out-of-Order Loop Speculation In an effort to better exploit loop level parallelism, we also evaluate simultaneously speculating on multiple levels of the same loop nest.  ... 
doi:10.1109/iiswc.2010.5649169 dblp:conf/iiswc/IoannouSKXYPBLWC10 fatcat:je3jgv43erb4fia2toybgn3dda

The Intel IA-64 compiler code generator

J. Bharadwaj, W.Y. Chen, W. Chuang, G. Hoflehner, K. Menezes, K. Muthukumar, J. Pierce
2000 IEEE Micro  
In planning the new EPIC (Explicitly Parallel Instruction Computing) architecture, Intel designers wanted to exploit the high level of instruction-level parallelism (ILP) found in application code.  ...  The ECG contains two schedulers: the software pipeliner for targeted cyclic regions and the global code scheduler for all remaining regions. Both schedulers make use of control and data speculation.  ...  Acknowledgments We acknowledge Kent Fielden, Dong-Yuan Chen, Youfeng Wu, Roland Kenner, and Chris McKinsey for their previous work in the design and implementation of parts of the ECG.  ... 
doi:10.1109/40.877949 fatcat:vqwufz7bl5a3jka7qkc5uzo6sq

Software speculative multithreading for Java

Christopher J. F. Pickett
2007 Companion to the 22nd ACM SIGPLAN conference on Object oriented programming systems and applications companion - OOPSLA '07  
Automatic parallelization is a compiler and/or runtime optimization that allows single-threaded programs to exploit multiple processors without additional programmer effort.  ...  and functional programs are good candidates for parallelization [33].  ...  exploits its software implementation context. • Support for nested method level speculation, both in-order and out-of-order.  ... 
doi:10.1145/1297846.1297950 dblp:conf/oopsla/Pickett07 fatcat:6zydjc6hwbflvfstmm574e2nam

A macrotask-level unlimited speculative execution on multiprocessors

Hayato Yamana, Mitsuhisa Sato, Yuetsu Kodama, Hirofumi Sakane, Shuichi Sakai, Yoshinori Yamaguchi
1995 Proceedings of the 9th international conference on Supercomputing - ICS '95  
The speculation both inter and intra processors is performed to exploit full parallelism in a program.  ...  The result is shown in Conclusions In this paper, we propose the unlimited speculative execution that enables the speculation inter processors to exploit rich parallelism.  ... 
doi:10.1145/224538.224620 dblp:conf/ics/YamanaSKSSY95 fatcat:zlkhkizl6feu3dx7wnzamaunfe