A Scala Prototype to Generate Multigrid Solver Implementations for Different Problems and Target Multi-Core Platforms [article]

Harald Koestler, Christian Schmitt, Sebastian Kuckuk, Frank Hannig, Juergen Teich, Ulrich Ruede
2014 arXiv   pre-print
Two different test problems showcase our proposed automatic generation of multigrid solvers for both CPU and GPU target platforms.  ...  In this article we provide a prototype implementation in Scala for a framework that allows abstract descriptions of PDEs, their discretization, and their numerical solution via multigrid algorithms.  ...  The first funding period runs from January 2013 to December 2015.  ... 
arXiv:1406.5369v1 fatcat:m644lfexcnhrxmpgxi4ab2pouy

ExaSlang: A Domain-Specific Language for Highly Scalable Multigrid Solvers

Christian Schmitt, Sebastian Kuckuk, Frank Hannig, Harald Köstler, Jürgen Teich
2014 2014 Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing  
We propose ExaSlang, a language for the specification of numerical solvers based on the multigrid method targeting distributed-memory systems.  ...  As a remedy, domain-specific languages (DSLs) are a convenient technology for domain experts to describe settings and problems they want to solve using terms and models familiar to them.  ...  In previous work, we have demonstrated the feasibility of this approach by showcasing a simple prototype compiler for the generation of parallel multigrid solvers with low variability [12].  ... 
doi:10.1109/wolfhpc.2014.11 dblp:conf/sc/SchmittKHKT14 fatcat:ut6aplld25d3pj4ibh5o4lzifu

Code generation approaches for parallel geometric multigrid solvers

Harald Köstler, Marco Heisig, Nils Kohl, Sebastian Kuckuk, Martin Bauer, Ulrich Rüde
2020 Analele Stiintifice ale Universitatii Ovidius Constanta: Seria Matematica  
In contrast to manual implementations in a general-purpose computing language, they allow to integrate automatic code transforms to produce efficient code for different models and platforms.  ...  As an example the numerical solution of an elliptic partial differential equation via generated geometric multigrid solvers is considered.  ...  programme SPP 1648 "Software for Exascale Computing" through the projects TerraNeo and ExaStencils.  ... 
doi:10.2478/auom-2020-0038 fatcat:rr63bsabc5cbbbcw7eljtr3gzy

ExaStencils: Advanced Multigrid Solver Generation [chapter]

Christian Lengauer, Sven Apel, Matthias Bolten, Shigeru Chiba, Ulrich Rüde, Jürgen Teich, Armin Größlinger, Frank Hannig, Harald Köstler, Lisa Claus, Alexander Grebhahn, Stefan Groth (+5 others)
2020 Lecture Notes in Computational Science and Engineering  
At every layer, the corresponding language expresses not only computational directives but also domain knowledge of the problem and platform to be leveraged for optimization.  ...  and the most concrete a full, automatically generated implementation.  ...  We thank the Jülich Supercomputing Center (JSC) for providing access to the supercomputers JUQUEEN and JURECA and the Swiss National Supercomputing Centre (CSCS) for providing computational resources and  ... 
doi:10.1007/978-3-030-47956-5_14 fatcat:xhbxnt45ynhilgh6vv2ui2n76i

A UPC++ Actor Library and Its Evaluation On a Shallow Water Proxy Application

Alexander Pöppl, Scott Baden, Michael Bader
2019 2019 IEEE/ACM Parallel Applications Workshop, Alternatives To MPI (PAW-ATM)  
This architecture features multiple smaller groups of CPU cores (called tiles) that share a cache hierarchy and a memory. The different tiles are connected using a Network-on-Chip.  ...  The most widely used approach here is to use MPI for inter-node communication and parallelization, and OpenMP for the on-node parallelization.  ...  on the loops and by requiring the compiler to generate vectorized versions of the Riemann solver functions using #pragma omp declare simd.  ... 
doi:10.1109/paw-atm49560.2019.00007 dblp:conf/sc/PopplBB19 fatcat:ecudfm2vvvavfbkxqd7um4ajoe

From constraint programming to heterogeneous parallelism [article]

Philip Andreas Ginsbach, University of Edinburgh, Michael O'Boyle, Bjoern Franke, Adam Lopez
The scaling limitations of multi-core processor development have led to a diversification of the processor cores used within individual computers.  ...  The detection of computational idioms in their middle end enables compilers to incorporate DSL and library backends for code generation.  ...  For the programs where idioms dominate execution time, accelerator code was generated and evaluated on 3 platforms: a multi-core CPU, an integrated APU, and an external GPU.  ... 
doi:10.7488/era/309 fatcat:7o26d74r6rb5hkp2wiutvehv3u

Aspects of Code Generation and Data Transfer Techniques for Modern Parallel Architectures

Manuel Mohr
A radical solution to this problem is to abolish global cache coherence.  ...  Popcorn [Bar+15] modifies Linux to run on platforms consisting of multiple OS-capable multi-core processors with different ISAs, such as a regular x86 multi-core extended with a PCIe-based Intel Xeon  ...  As we use one of these prototypes for our evaluation in Chapter 4 and a derived prototype platform for our evaluation in Chapter 5, we give a brief overview of the platform's characteristics.  ... 
doi:10.5445/ir/1000085052 fatcat:5omn7z2o3jhtra5wcxj4k3slmu

A UPC++ Actor Library and Its Evaluation On a Shallow Water Proxy Application

Pöppl, Alexander; Baden, S.; Bader, Michael
This architecture features multiple smaller groups of CPU cores (called tiles) that share a cache hierarchy and a memory. The different tiles are connected using a Network-on-Chip.  ...  The most widely used approach here is to use MPI for inter-node communication and parallelization, and OpenMP for the on-node parallelization.  ...  We currently have a prototypic implementation of this in SWE-X10. Pond may be viewed as an amalgamation of SWE and SWE-X10.  ... 
doi:10.25344/s43g60 fatcat:wqhgrdl4sjfbzhuneklalzwnyu

Tomography of the Earth's Crust – From Geophysical Sounding to Real-Time Monitoring : Status Seminar 2. May 2011, GFZ, German Research Centre for Geosciences, Potsdam ; Programme & Abstracts

U. Münch
A lot of technological developments and innovations have been made in recent years, like real-time data acquisition and evaluation in addition to computer-aided visualization programmes. But there is  ...  still the need to integrate and combine different methods, in particular inversion methods. This volume summarizes the scientific goals and first results presented during the kick-off seminar at the GFZ  ...  Acknowledgements The project MuSaWa is part of the R&D programme GEOTECHNOLOGIEN and is funded by the German Ministry for Education and Research, grant 03G0745A.  ... 
doi:10.2312/ fatcat:eifbydnlcbh5tjl6d4rjii5egu

Resource-aware Programming in a High-level Language - Improved performance with manageable effort on clustered MPSoCs

Andreas Zwinkau
The primary hardware platforms being targeted by the language are clusters of multi-core processors linked together into a large scale system via a high-performance network.  ...  For a concrete implementation, I build on the platform described in chapter 3 and built with a large team within Invasive Computing.  ...  A "hang" means the application locked up somehow. All cores are idle. All activities and i-lets are blocked. A "crash" means the application terminated with an error.  ... 
doi:10.5445/ir/1000083526 fatcat:3yzywk3pnvg25fmi32xcik6kwi

Code Generation for High Performance PDE Solvers on Modern Architectures

Dominic Kempf
This is both due to a lack of numerical algorithms suited for the hardware and efficient implementations of these algorithms not being available.  ...  This work proposes generative programming as a solution to this issue.  ...  This can be explained with the high total number of FMA chains needed to saturate the two floating point ports of the processor: The pipeline depth  ... 
doi:10.11588/heidok.00027360 fatcat:sjn764xlsbcy7krvcw4so24q5q