Tuning and Optimization for a Variety of Many-Core Architectures Without Changing a Single Line of Implementation Code Using the Alpaka Library [chapter]

Alexander Matthes, René Widera, Erik Zenker, Benjamin Worpitz, Axel Huebl, Michael Bussmann
2017 Lecture Notes in Computer Science  
We present an analysis on optimizing performance of a single C++11 source code using the Alpaka hardware abstraction library.  ...  For this we use the general matrix multiplication (GEMM) algorithm in order to show that compilers can optimize Alpaka code effectively when tuning key parameters of the algorithm.  ...  Conclusion Within the scope of this work we have shown that portable single-source C++11 code using Alpaka can run on current many-core architectures without changing any line inside the algorithmic relevant  ... 
doi:10.1007/978-3-319-67630-2_36 fatcat:k76veiy34zdtzbbxpvx6kxw23q

Alpaka -- An Abstraction Library for Parallel Kernel Acceleration

Erik Zenker, Benjamin Worpitz, Rene Widera, Axel Huebl, Guido Juckeland, Andreas Knupfer, Wolfgang E. Nagel, Michael Bussmann
2016 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)  
Running Alpaka applications on a new (and supported) platform requires the change of only one source code line instead of a lot of #ifdefs.  ...  The Alpaka C++ template interface allows for straightforward extension of the library to support other accelerators and specialization of its internals for optimization.  ...  Parallel performance currently relies on the efficient use of many-core architectures that are commonly found in a heterogeneous environment of multi-core CPU and many-core accelerator hardware.  ... 
doi:10.1109/ipdpsw.2016.50 dblp:conf/ipps/ZenkerWWHJKNB16 fatcat:zgvpbvbeeneuznpica7l6h2owm

Investigating Performance Portability Of A Highly Scalable Particle-In-Cell Simulation Code On Various Multi-Core Architectures

Benjamin Worpitz, Prof. Dr. Wolfgang E. Nagel, Dr. Michael Bussmann, Dr. Guido Juckeland, Dr. Andreas Knüpfer, Dr. Bernd Trenkler
2015 Zenodo  
The alpaka library defines and implements an abstract hierarchical redundant parallelism model.  ...  The C++ template interface provided allows for straightforward extension of the library to support other accelerators and specialization of its internals for optimization.  ...  All of this can be done independently by users without having to change any line of code of the existing alpaka library due to its modular, non-intrusive, extensible design.  ... 
doi:10.5281/zenodo.49768 fatcat:gw53fnzwxfa53n2xqg6dpohqle

Parallel Programming Models for Heterogeneous Many-Cores : A Survey [article]

Jianbin Fang, Chun Huang, Tao Tang, Zheng Wang
2020 arXiv   pre-print
In this article, we provide a comprehensive survey for parallel programming models for heterogeneous many-core architectures and review the compiling techniques of improving programmability and portability  ...  We provide a road map for a wide variety of different research areas. We conclude with a discussion on open issues in the area and potential research directions.  ...  Porting OpenCL to a new many-core device is a matter of providing an implementation of the runtime library that conforms to the standard, achieving the goal of code portability [90].  ... 
arXiv:2005.04094v1 fatcat:e2psrdnyajh3hih3znnjjbezae

Parallel programming models for heterogeneous many-cores: a comprehensive survey

Jianbin Fang, Chun Huang, Tao Tang, Zheng Wang
2020 CCF Transactions on High Performance Computing  
In this article, we provide a comprehensive survey for parallel programming models for heterogeneous many-core architectures and review the compiling techniques of improving programmability and portability  ...  We provide a road map for a wide variety of different research areas. We conclude with a discussion on open issues in the area and potential research directions.  ...  Porting OpenCL to a new many-core device is a matter of providing an implementation of the runtime library that conforms to the standard, achieving the goal of code portability.  ... 
doi:10.1007/s42514-020-00039-4 fatcat:nn56xhjm6rcu7kya6gfnyjg66q

HL-LHC Analysis With ROOT [article]

Axel Naumann, Philippe Canal, Enric Tejedor, Enrico Guiraud, Lorenzo Moneta, Bertrand Bellenot, Olivier Couet, Alja Mrak Tadel, Matevz Tadel, Sergey Linev, Javier Lopez Gomez, Jonas Rembser (+5 others)
2022 arXiv   pre-print
With another significant increase in the amount of data to be handled scheduled to arrive in 2027, ROOT is preparing for a massive upgrade of its core ingredients.  ...  As part of a review of crucial software for high energy physics, the ROOT team has documented its R&D plans for the coming years.  ...  In the future, users will be able to distribute the same computation graph over a set of different cluster frameworks by changing a single line of code.  ... 
arXiv:2205.06121v1 fatcat:f7rk3km77feifmiqz6dia4sq5y

Specification Of Hpc Hardware And Program Components To Enable Further Optimized Mappings

Carlchristian Helmut Johannes Eckert, Wolfgang E Nagel, Jerónimo Castrillón
2016 Zenodo  
Through these features, Dodo forms the base for tools that can specialize in the creation of optimized domain decompositions and mappings.  ...  All data structures are based on widely used libraries and can be interfaced with third-party tools.  ...  Figure 6.1: Comparison of code lines for the Game Of Life implementation.  ... 
doi:10.5281/zenodo.163329 fatcat:mkb4ewc4nvdfrcclwlqzrbg7ry

PIConGPU: Predictive Simulations of Laser-Particle Accelerators with Manycore Hardware

Axel Huebl, Dr. Michael Bussmann, Dr. Thomas Kluge, Prof. Dr. Ulrich Schramm, Prof. Dr. Thomas E. Cowan, Prof. Dr. Paul Gibbon, Prof. Dr. Burkhardt Kämpfer, PD Dr. Stefan Grafström
2019 Zenodo  
PIConGPU is designed with a modular and extensible implementation, allowing to compute on current and upcoming hardware from a single code base.  ...  The latter are of special interest as they may provide a compact source for energetic ion beams.  ...  "Tuning and optimization for a variety of many-core architectures without changing a single line of implementation code using the alpaka library".  ... 
doi:10.5281/zenodo.3266819 fatcat:b73a3zvxvjecjplcsks7gf52cu

PUNCH4NFDI Consortium Proposal

The PUNCH4NFDI Consortium
2020 Zenodo  
that offers all storage and compute opportunities and all data transformation possibilities required for making the data fully productive in a sustainable way.  ...  Organised in 7 task areas, the consortium ultimately aims at establishing FAIR digital research products for its communities and beyond, spending their entire lifecycle inside a "science data platform"  ...  WP 3.2 addresses the challenge of optimally using heterogeneous and evolving architectures for numerical methods and simulations.  ... 
doi:10.5281/zenodo.5722894 fatcat:mfcvk55kqvgsthkp6dnxqyiyve

HL-LHC Computing Review Stage-2, Common Software Projects: Event Generators

Efe Yazgan, Josh McFayden, Andrea Valassi, Simone Amoroso, Enrico Bothmann, Andy Buckley, John Campbell, Gurpreet Singh Chahal, Taylor Childers, Gloria Corti, Rikkert Frederix, Stefano Frixione (+26 others)
2021
software for HL-LHC, which has since been updated and published, and which we are also submitting to the November 2021 review as an integral part of our contribution.  ...  It complements previous documents prepared by the WG in the context of the first phase of the LHCC review in 2020, including in particular the WG paper on the specific challenges in Monte Carlo event generator  ...  Professor [84] is an MC-tuning and general parametrization and optimization tool used to make many of the main LHC MC-generator tunes.  ... 
doi:10.3204/pubdb-2021-04727 fatcat:bir45trjlvgdrh4lv7m2vkq4um