16,810 Hits in 6.6 sec

Maximizing multiprocessor performance with the SUIF compiler

M.W. Hall, J.M. Anderson, S.P. Amarasinghe, B.R. Murphy, Shih-Wei Liao, E. Bugnion, M.S. Lam
1996 Computer  
Using a vector architecture effectively involves parallelizing repeated arithmetic operations on large data streams, for example the innermost loops in array-oriented programs.  ...  To use a multiprocessor effectively, the compiler must exploit coarse-grain parallelism, locating large computations that can execute independently in parallel.  ...  Acknowledgments This research was supported in part by the Air Force Materiel Command and ARPA contracts F30602-95-C-0098, DABT63-95-C-0118, and DABT63-94-C-0054; a Digital Equipment Corporation grant;  ...
doi:10.1109/2.546613 fatcat:6x7urb56urbrho5ycgavfdxwte
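
As a rough illustration of the distinction the abstract draws between coarse-grain and fine-grain parallelism, the sketch below marks the two levels in a small C loop nest; the routine, array names, and sizes are invented for this example and are not taken from the paper.

    /* Illustrative sketch only (not SUIF output). The outer loop is the
     * coarse-grain unit: each iteration is a large, independent computation
     * that can run on its own processor. The innermost loop is the repeated
     * arithmetic on a data stream that a vectorizer would target. */
    #define N 1024
    #define M 1024

    void scale_rows(double a[N][M], const double s[N])
    {
        #pragma omp parallel for            /* coarse grain: one row per processor */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < M; j++)     /* fine grain: unit-stride inner loop  */
                a[i][j] *= s[i];
    }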

Compilers for instruction-level parallelism

M. Schlansker, T.M. Conte, J. Dehnert, K. Ebcioglu, J.Z. Fang, C.L. Thompson
1997 Computer  
Compilers use global knowledge of the application program not readily available to a hardware interpreter as well as a description of the target machine architecture to guide the machine-specific optimizations  ...  Instruction-level parallelism allows a sequence of instructions derived from a sequential program to be parallelized for execution on multiple pipelined functional units.  ...
doi:10.1109/2.642817 fatcat:sqa3irdg3zcqzftmok3rpsv65a

Loop optimizations in C and C++ compilers: an overview

Réka Kovács, Zoltán Porkoláb
2020 Annales Mathematicae et Informaticae  
In this paper, we give an overview of the scientific literature on loop optimization technology, and summarize the status of current implementations in the most widely used C and C++ compilers in the industry  ...  Therefore we increasingly rely on compilers to do the heavy-lifting for us. A significant part of optimizations done by compilers are loop optimizations.  ...  The publication of this paper is supported by the European Union, co-financed by the European Social Fund (EFOP-3.6.3-VEKOP-16-2017-00002).  ... 
doi:10.33039/ami.2020.07.003 fatcat:wnuup2mcbffotbcrl3cv6y2mhu
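
A classic example of the transformations such a survey covers is loop interchange for spatial locality. The before/after pair below is hand-written purely as an illustration (the array size and function names are made up); whether a particular C or C++ compiler performs the interchange automatically depends on the compiler and optimization level.

    #define N 2048

    /* Before: the inner loop walks down a column of a row-major array, so
     * every access has stride N. Assumes col_sum[] is zeroed by the caller. */
    void sum_cols_slow(const double a[N][N], double col_sum[N])
    {
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                col_sum[j] += a[i][j];
    }

    /* After interchange: the inner loop now walks along a row (unit stride),
     * which is far friendlier to caches and hardware prefetchers. */
    void sum_cols_fast(const double a[N][N], double col_sum[N])
    {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                col_sum[j] += a[i][j];
    }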

MATLAB Parallelization through Scalarization

Chun-Yu Shei, Adarsh Yoga, Madhav Ramesh, Arun Chauhan
2011 2011 15th Workshop on Interaction between Compilers and Computer Architectures  
We have implemented this strategy in a MATLAB compiler that compiles portions of MATLAB to C++ or CUDA C.  ...  Additional array temporaries are obviated in the case of array subscripts.  ...  The notion of optimizing a program in steps, as more information becomes available, has been used in a version of ML, called MetaML, where it is called multi-staging [24] .  ... 
doi:10.1109/interact.2011.18 dblp:conf/IEEEinteract/SheiYRC11 fatcat:hmnywoccd5cjxawmkb6cw6hq7u
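
To make the remark about obviated array temporaries concrete, the sketch below shows what scalarizing an element-wise expression amounts to; the MATLAB-style expression and the C function are hypothetical and are not the compiler's actual output.

    /* Hypothetical illustration of scalarization. A MATLAB-style expression
     *     r = a .* b + c;
     * evaluated with whole-array operations would materialize a temporary
     * array for (a .* b). Compiling it to one element-wise loop fuses the
     * two operations, so no intermediate array is ever allocated. */
    void fused_multiply_add(const double *a, const double *b, const double *c,
                            double *r, int n)
    {
        for (int i = 0; i < n; i++)
            r[i] = a[i] * b[i] + c[i];
    }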

SUIF: An infrastructure for research on parallelizing and optimizing compilers

Robert P. Wilson, Monica S. Lam, John L. Hennessy, Robert S. French, Christopher S. Wilson, Saman P. Amarasinghe, Jennifer M. Anderson, Steve W. K. Tjiang, Shih-Wei Liao, Chau-Wen Tseng, Mary W. Hall
1994 SIGPLAN Notices
The toolkit currently includes C and Fortran front ends, a loop-level parallelism and locality optimizer, an optimizing MIPS back end, a set of compiler development tools, and support for instructional  ...  SUIF consists of a small, clearly documented kernel and a toolkit of compiler passes built on top of the kernel.  ...  out bugs in the compiler, Karen Pieper for her help on the original SUIF system design, Mike Smith for his MIPS code generator, Todd Smith for his work on the translators between C and SUIF, and Michael  ... 
doi:10.1145/193209.193217 fatcat:yleymrlwuzfhzc2odlec6mv7si

Improving register allocation for subscripted variables

David Callahan, Steve Carr, Ken Kennedy
1990 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation - PLDI '90  
His accomplishments made him a pioneer not only of computer architecture but also in compiler optimization.  ...  ACKNOWLEDGMENTS The authors would like to express our sincerest gratitude to the late John Cocke, who inspired and funded this work while at IBM.  ...  This approach and its descendants have led to substantive, and in some cases dramatic, improvements in the performance of scientific programs on machines with long memory latencies.  ... 
doi:10.1145/93542.93553 dblp:conf/pldi/CallahanCK90 fatcat:w4wsph4nobddzbwp7grste6znm
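
The underlying idea, commonly known as scalar replacement, is easy to see on a textbook-style recurrence; the loop below is an invented example, not code from the paper.

    /* In the original loop, a[i-1] was written in the previous iteration, but
     * a conventional register allocator does not keep subscripted variables in
     * registers, so the value is reloaded from memory every time around. */
    void smooth(double *a, const double *b, int n)
    {
        for (int i = 1; i < n; i++)
            a[i] = a[i - 1] + b[i];
    }

    /* After scalar replacement, the loop-carried value lives in the scalar t,
     * which an ordinary register allocator can keep in a register. */
    void smooth_scalar_replaced(double *a, const double *b, int n)
    {
        double t = a[0];
        for (int i = 1; i < n; i++) {
            t = t + b[i];
            a[i] = t;
        }
    }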

ispc: A SPMD compiler for high-performance CPU programming

Matt Pharr, William R. Mark
2012 2012 Innovative Parallel Computing (InPar)  
Existing CPU parallel programming models focus primarily on multi-core parallelism, neglecting the substantial computational capabilities that are available in CPU SIMD vector units.  ...  We have developed a compiler, the Intel® SPMD Program Compiler (ispc), that delivers very high performance on CPUs thanks to effective use of both multiple processor cores and SIMD vector units. ispc  ...  Tim suggested the "SPMD on SIMD" terminology and has extensively argued for the advantages of the SPMD model.  ...
doi:10.1109/inpar.2012.6339601 fatcat:nfbplb43jvgmjgxxbh27wftcn4
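
The "SPMD on SIMD" model mentioned in the abstract can be sketched in plain C (this is not ispc syntax): the loop body is written as if for a single program instance, and the gang loop is what an SPMD compiler maps onto the SIMD lanes of one core, with different cores each running their own gang. The gang width of 8 is an arbitrary assumption.

    #define GANG 8   /* hypothetical gang size, e.g. one 8-wide AVX register */

    /* Plain-C sketch of SPMD execution of saxpy: each "program instance"
     * handles one element, and the GANG instances run in lockstep, which is
     * exactly what a SIMD unit provides. */
    void saxpy(float alpha, const float *x, float *y, int n)
    {
        for (int base = 0; base < n; base += GANG) {
            for (int lane = 0; lane < GANG && base + lane < n; lane++) {
                int i = base + lane;
                y[i] = alpha * x[i] + y[i];
            }
        }
    }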

Exploiting Implicit Parallelism in Dynamic Array Programming Languages

Shams Imam, Vivek Sarkar, David Leibs, Peter B. Kessler
2014 Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming - ARRAY'14  
The J programming language includes the usual idioms of operations on arrays of the same size and shape, where the operations can often be performed in parallel for each individual item of the operands  ...  The interpreter itself is responsible for exploiting the parallelism available in the applications.  ...  In this section, we describe some of our optimizations that enable us to execute J programs on our interpreter efficiently in terms of execution time performance.  ... 
doi:10.1145/2627373.2627374 dblp:conf/pldi/ImamSLK14 fatcat:2lvchvw5xjhnvnuiw6imjopomy

Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture [chapter]

Joseph Gebis, Leonid Oliker, John Shalf, Samuel Williams, Katherine Yelick
2009 Lecture Notes in Computer Science  
the programmer and compiler with an unfamiliar and difficult programming model.  ...  In this work, we present the Virtual Vector Architecture (ViVA), which combines the memory semantics of vector computers with a software-controlled scratchpad memory in order to provide a more effective  ...  Acknowledgments All authors from LBNL were supported by the Office of Advanced Scientific Computing Research in the DOE Office of Science under contract number DE-AC02-05CH11231.  ... 
doi:10.1007/978-3-642-00454-4_16 fatcat:kta3jonbpfh7ba47yrre6xaguq

FFT Compiler Techniques [chapter]

Stefan Kral, Franz Franchetti, Juergen Lorenz, Christoph W. Ueberhuber, Peter Wurzinger
2004 Lecture Notes in Computer Science  
The floating-point performance of Fftw's scalar version has been more than doubled, resulting in the fastest FFT implementation to date.  ...  , and IBM's SIMD operations implemented on the new processors of the BlueGene/L supercomputer. The paper introduces a special compiler backend for Intel P4's SSE 2 and AMD's 3DNow!  ...  We would like to thank Matteo Frigo and Steven Johnson for many years of prospering cooperation and for making it possible for us to access non-public versions of Fftw.  ...
doi:10.1007/978-3-540-24723-4_15 fatcat:iftd7t3c6rgengdgu2q6xjsxe4

Improving register allocation for subscripted variables

David Callahan, Steve Carr, Ken Kennedy
1990 SIGPLAN Notices
His accomplishments made him a pioneer not only of computer architecture but also in compiler optimization.  ...  ACKNOWLEDGMENTS The authors would like to express our sincerest gratitude to the late John Cocke, who inspired and funded this work while at IBM.  ...  This approach and its descendants have led to substantive, and in some cases dramatic, improvements in the performance of scientific programs on machines with long memory latencies.  ... 
doi:10.1145/93548.93553 fatcat:ikzj44grtvb2xp5bzcvxiz2twe

Improving register allocation for subscripted variables

David Callahan, Steve Carr, Ken Kennedy
2004 SIGPLAN Notices
His accomplishments made him a pioneer not only of computer architecture but also in compiler optimization.  ...  ACKNOWLEDGMENTS The authors would like to express our sincerest gratitude to the late John Cocke, who inspired and funded this work while at IBM.  ...  This approach and its descendants have led to substantive, and in some cases dramatic, improvements in the performance of scientific programs on machines with long memory latencies.  ... 
doi:10.1145/989393.989428 fatcat:okbavwnrsjcilm3siuatbpk5s4

Automatic and Interactive Program Parallelization Using the Cetus Source to Source Compiler Infrastructure v2.0

Akshay Bhosale, Parinaz Barakhshan, Miguel Romero Rosas, Rudolf Eigenmann
2022 Electronics  
Cetus is used for research on compiler optimizations for multi-cores with an emphasis on automatic parallelization.  ...  The compiler has gone through several iterations of benchmark studies and implementations of those techniques that could improve the parallel performance of these programs.  ...  The effect of the various optimization techniques in Cetus on the performance of prior generations of benchmarks and machines is well documented [2, 3] .  ... 
doi:10.3390/electronics11050809 fatcat:f3djr5rigne3jp7a63tjh26tsi

Automatic Parallelization of Array-oriented Programs for a Multi-core Machine

Wai-Mee Ching, Da Zheng
2012 International Journal of Parallel Programming
We present work on the automatic parallelization of array-oriented programs for multi-core machines.  ...  Source programs written in standard APL are translated by a parallelizing APL-to-C compiler into parallelized C code, i.e., C mixed with OpenMP directives.  ...  Alex Katz for his help in editing the manuscript to improve its English. We also thank referees for their help in improving this paper.  ...
doi:10.1007/s10766-012-0197-6 fatcat:arjv5k7snngotbcqycehfay65m
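
For a sense of what "parallelized C code, i.e., C mixed with OpenMP directives" looks like, the fragment below shows the kind of code such a translator could emit for an element-wise APL expression; it is a hypothetical illustration, not actual output of the compiler described above.

    /* Possible shape of generated code for an APL expression like  r ← a + b  */
    void generated_add(const double *a, const double *b, double *r, int n)
    {
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            r[i] = a[i] + b[i];
    }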

An OpenMP Compiler Benchmark

Matthias S. Müller
2003 Scientific Programming  
Six out of seven proposed optimization techniques are already implemented in different compilers. However, most compilers implement only one or two of them.  ...  The purpose of this benchmark is to propose several optimization techniques and to test their existence in current OpenMP compilers.  ...  Since the goal of parallel programming is to achieve higher performance, the further acceptance of OpenMP will strongly depend on compiler optimization techniques especially in the field where OpenMP has  ... 
doi:10.1155/2003/287461 fatcat:lwygisbdszcmjbef2r35hmgnne
Showing results 1–15 of 16,810