
Advanced Stencil-Code Engineering (Dagstuhl Seminar 15161)

Christian Lengauer, Matthias Bolten, Robert D. Falgout, Olaf Schenk, Marc Herbstritt
2015 Dagstuhl Reports  
This report documents the program and the outcomes of Dagstuhl Seminar 15161 "Advanced Stencil-Code Engineering".  ...  Its aim was to lay the basis for a new interdisciplinary research community on high-performance stencil codes.  ...  via product-line technology and domain engineering; the use of powerful models for program optimization, like the polyhedron model for loop parallelization and  ... 
doi:10.4230/dagrep.5.4.56 dblp:journals/dagstuhl-reports/LengauerBFS15 fatcat:suk5zayvlnb63ozlamrw2vk2w4

Persistent Asynchronous Adaptive Specialization for Generic Array Programming

Clemens Grelck, Heinrich Wiesinger
2018 International journal of parallel programming  
A number of critical issues that, among others, stem from the interplay between function specialization and function overloading catch our special attention.  ...  We describe the solutions adopted and illustrate the benefits of persistent asynchronous adaptive specialization by a series of experiments.  ...  Apart from the obvious reason that generic code maintains more information in runtime data structures, the crucial issue are the SaC compiler's advanced optimizations [3] that are not as effective on  ... 
doi:10.1007/s10766-018-0567-9 fatcat:csjgago3qff4pp56cu6bp7p3sa

Automatic generation of specialized direct convolutions for mobile GPUs

Naums Mogers, Valentin Radu, Lu Li, Jack Turner, Michael O'Boyle, Christophe Dubach
2020 Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit  
Using Lift, we show that it is possible to generate automatically code that is ×10 faster than the direct convolution while using ×3.6 less space than the GEMM-based convolution of the very specialized  ...  However, writing optimized parallel code for GPUs is far from trivial.  ...  Acknowledgments This work was supported by the Engineering and Physical Sciences Research Council (grant EP/L01503X/1), EPSRC Centre for Doctoral Training in Pervasive Parallelism at the University of  ... 
doi:10.1145/3366428.3380771 dblp:conf/ppopp/MogersRLTOD20 fatcat:342savoeijb3zaznujfmhptoku

Analyzing Behavior Specialized Acceleration

Tony Nowatzki, Karthikeyan Sankaralingam
2016 ACM SIGOPS Operating Systems Review  
Of significant interest are Behavioral Specialized Accelerators (BSAs), which are designed to efficiently execute code with only certain properties, but remain largely configurable or programmable.  ...  Hardware specialization has become a promising paradigm for overcoming the inefficiencies of general purpose microprocessors.  ...  Conservation Cores are automatically generated, simple hardware implementations of application code [53], which are meant as offload engines for in-order cores.  ... 
doi:10.1145/2954680.2872412 fatcat:66uy7l3ggbh6ze2mp33wtgmbtm

Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores [article]

Fabian Schuiki, Florian Zaruba, Torsten Hoefler, Luca Benini
2020 arXiv   pre-print
Single-issue processor cores are very energy efficient but suffer from the von Neumann bottleneck, in that they must explicitly fetch and issue the loads/stores necessary to feed their ALU/FPU.  ...  Sequential code runs 3x faster on a single core, and 3x fewer cores are needed in a cluster to achieve the same performance.  ...  Existing software would be able to make use of our architectural extension via simple recompilation without the need for re-engineering work.  ... 
arXiv:1911.08356v3 fatcat:3yzaep7h3femvc7geusyny3mwa

Full Issue

Harmony Bench
2014 The International Journal of Screendance  
Penelope Broadbent, "Review: Mortal Engine," Australian Stage, March 8, 2009, http://www.australianstage.com.au/reviews/melbourne/mortal-engine--chunkymove-2296.html 17. Ibid. 18.  ...  Hence the face can only be meaningful through the semiotic coding of the facial machine.  ...  See also her essay in this issue. 10 .  ... 
doi:10.18061/ijsd.v4i0.4561 fatcat:wc6etmacwrfnbeniz47d3gayjm

Kernel Specialization for Improved Adaptability and Performance on Graphics Processing Units (GPUs)

Nicholas Moore, Miriam Leeser, Laurie Smith King
2013 2013 IEEE 27th International Symposium on Parallel and Distributed Processing  
These factors limit code reuse and the applicability of GPU computing to a wider variety of problems.  ...  As a result, many GPU codes offer minimum levels of adaptability to variations among problem instances and hardware configurations.  ...  [10] are able to use the JIT compilation features of Ruby and Python (based on PyCUDA) to transform concise embedded stencil representations into GPU code.  ... 
doi:10.1109/ipdps.2013.31 dblp:conf/ipps/MooreLK13 fatcat:5wxiwkpk2zdbte2rjwmqspp7um

Bridging the Gap Between General-Purpose and Domain-Specific Compilers with Synthesis

Alvin Cheung, Shoaib Kamil, Armando Solar-Lezama, Marc Herbstritt
2015 Summit on Advances in Programming Languages  
By leveraging general synthesis technology, it is possible to have a generic kernel translator that can be specialized by compiler developers for each domain-specific compiler, making it easy to build new  ...  Each kernel translator is associated with a domain-specific compiler, and the role of each kernel translator is to scan the input code in search of code fragments that can be optimized by the domain-specific  ...  Introduction Despite significant advances in compiler optimization over the last thirty years, the promise of actual optimality in generated code remains elusive; even today, clean and high-level code  ... 
doi:10.4230/lipics.snapl.2015.51 dblp:conf/snapl/CheungKS15 fatcat:lqcelmyyfrcmhatjowolqgdihi

Architecture and performance of Devito, a system for automated stencil computation [article]

Fabio Luporini, Michael Lange, Mathias Louboutin, Navjot Kukreja, Jan Hückelheim, Charles Yount, Philipp Witte, Paul H. J. Kelly, Felix J. Herrmann, Gerard J. Gorman
2020 arXiv   pre-print
Devito is a framework capable of generating highly-optimized code given symbolic equations expressed in Python, specialized in, but not limited to, affine (stencil) codes.  ...  Several performance optimizations are introduced, including advanced common sub-expressions elimination, tiling and parallelization.  ...  YLE: YASK Loop Engine. YASK (Yet Another Stencil Kit) is an open-source C++ software framework for generating high-performance implementations of stencil codes for Intel® Xeon® and Intel® Xeon Phi™  ... 
arXiv:1807.03032v3 fatcat:p6ljii55pfcsvpdujjift64ex4

Download Complete Issue: CP68 (8MB)

Full Issue
2011 Cartographic Perspectives  
The Collections section article for this issue was written by MaryJo Price, Special Maps Librarian of the Lewis J. Ort Library at Frostburg State University.  ...  There is a recent recovery of these two mapping approaches in arts and humanities (see the recent special issue of Cartographic Perspectives [53] on mappings and the arts).  ... 
doi:10.14714/cp68.38 fatcat:msmzk5rmhzeapgtssx5pz5gddq

Productive Performance Engineering for Weather and Climate Modeling with Python [article]

Tal Ben-Nun, Linus Groner, Florian Deconinck, Tobias Wicky, Eddie Davis, Johann Dahm, Oliver D. Elbert, Rhea George, Jeremy McGibbon, Lukas Trümper, Elynn Wu, Oliver Fuhrer (+2 others)
2022 arXiv   pre-print
Earth system models are developed with a tight coupling to target hardware, often containing specialized code predicated on processor characteristics.  ...  By using a declarative Python-embedded stencil domain-specific language and data-centric optimization, we abstract hardware-specific details and define a semi-automated workflow for analyzing and optimizing  ...  The authors also wish to acknowledge the support from the PASC program (Platform for Advanced Scientific Computing) for the DaceMI project.  ... 
arXiv:2205.04148v2 fatcat:rhvxrwd4frabtboumy53ek7zaq

Download Complete Issue: CP72 (13.4MB)

Complete Issue
2013 Cartographic Perspectives  
No explanation is given for this special treatment.  ...  Add 'Timespan' code shown in 3. Since this code is specific to the Swan Island, TN, cross-section and location, fields within this timespan code will need to be replaced.  ... 
doi:10.14714/cp72.429 fatcat:x5dph4ifqvestfyjur6t3evn3i

The Library World Volume 10 Issue 2

1907 New Library World  
-At the end of the report recently issued by Mr.  ...  -In last month's issue we noted the opening of the new Public Library.  ...  ., made before the Library Association in 1900: "That a committee of the Association should be appointed to draw up a code of rules for the cataloguing for Local Collections so as to secure uniformity  ... 
doi:10.1108/eb008905 fatcat:e7oh2fwvnjfl3nhskpegzwfxka

Page 182 of Chemical Engineering Vol. 56, Issue 12 [page]

1949 Chemical Engineering  
Chemical Engineering's subscription records and stencils have been changed to extend all subscriptions affected by the change in rate.  ...  Write for proposal on your requirements Pressure Vessels Galvanizing Kettles Annealing Covers Tin Pots Salt Annealing Pots Wire Annealing Pots Special Welded Tanks ANNEALING BOX COMPANY APIL-ASME Codes  ... 

Compiling Parallel MATLAB for General Distributions using Telescoping Languages

Mary Fletcher, Cheryl McCosh, Guohua Jin, Ken Kennedy
2007 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07  
The two codes tested in this experiment are a 1D 2-point stencil using two double precision arrays each with 16M elements, and a 2D 4-point stencil using two 4K × 4K double precision arrays.  ...  This paper gives an overview of our strategy for Matlab D, focusing on the issue of data movement.  ... 
doi:10.1109/icassp.2007.367289 dblp:conf/icassp/FletcherMJK07 fatcat:mzrdb7rd35dqdmljtzwlj6mpry