Zero Overhead Modern C++ For Mapping To Any Programming Model
2018
Zenodo
Widera et al., ISC 2016, DOI: 10.1007/978-3-319-46079-6_21 [Figure residue: frames 0 to 4; labels "n = n max", "n < n max", "momentum"]
doi:10.5281/zenodo.1304271
fatcat:xqh2ghsfqvbppbmppwie54khbe
Porting the Plasma Simulation PIConGPU to Heterogeneous Architectures with Alpaka
2016
Zenodo
René Widera, Erik Zenker, Guido Juckeland • Computational Radiation Physics • www.hzdr.de/crp • { r.widera, e.zenker, g.juckeland }@hzdr.de ... Structures for Particles and Fields ...
doi:10.5281/zenodo.6336086
fatcat:nfx467scrjhmvk3cjvgn2wwhse
Metrics and Design of an Instruction Roofline Model for AMD GPUs
[article]
2021
arXiv
pre-print
Due to the recent announcement of the Frontier supercomputer, many scientific application developers are working to make their applications compatible with AMD architectures (CPU-GPU), which means moving away from the traditional CPU and NVIDIA-GPU systems. Due to the current limitations of profiling tools for AMD GPUs, this shift leaves a void in how to measure application performance on AMD GPUs. In this paper, we design an instruction roofline model for AMD GPUs using AMD's ROCProfiler and a
arXiv:2110.08221v2
fatcat:uusc263lufd7fiu6dpm6qupi5e
... benchmarking tool, BabelStream (the HIP implementation), as a way to measure an application's performance in instructions and memory transactions on new AMD hardware. Specifically, we create instruction roofline models for a case study scientific application, PIConGPU, an open source particle-in-cell (PIC) simulations application used for plasma and laser-plasma physics on the NVIDIA V100, AMD Radeon Instinct MI60, and AMD Instinct MI100 GPUs. When looking at the performance of multiple kernels of interest in PIConGPU we find that although the AMD MI100 GPU achieves a similar, or better, execution time compared to the NVIDIA V100 GPU, profiling tool differences make comparing performance of these two architectures hard. When looking at execution time, GIPS, and instruction intensity, the AMD MI60 achieves the worst performance out of the three GPUs used in this work.
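The two derived metrics named in this abstract, GIPS and instruction intensity, can be illustrated with a minimal C++ sketch. It assumes GIPS means billions of instructions per second and instruction intensity means instructions per memory transaction; the abstract does not spell out either definition or how instructions are counted. The struct and function names are hypothetical and do not correspond to ROCProfiler metric names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for raw profiler counters; the field names are
// illustrative only and are not ROCProfiler metric names.
struct KernelCounters {
    std::uint64_t instructions;        // instructions executed by the kernel
    std::uint64_t memoryTransactions;  // memory transactions issued by the kernel
    double seconds;                    // kernel execution time in seconds
};

// Billions of instructions per second (the y-axis of an instruction roofline).
inline double gips(const KernelCounters& c) {
    assert(c.seconds > 0.0);
    return static_cast<double>(c.instructions) / 1.0e9 / c.seconds;
}

// Instructions per memory transaction (assumed definition of instruction
// intensity, the x-axis of an instruction roofline).
inline double instructionIntensity(const KernelCounters& c) {
    assert(c.memoryTransactions > 0);
    return static_cast<double>(c.instructions) /
           static_cast<double>(c.memoryTransactions);
}
```

With these assumptions, a kernel that executes 4e9 instructions and issues 1e9 memory transactions in 2 s sits at 2 GIPS and an intensity of 4 instructions per transaction.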
LLAMA: The Low-Level Abstraction For Memory Access
[article]
2021
arXiv
pre-print
René Widera finished his education as IT specialist for application development at the Technical University Dresden (Germany) in 2009. ...
Rene Widera provided several additional foundational ideas on which LLAMA was built and
Conflict of interest The authors declare no potential conflict of interests. ...
arXiv:2106.04284v2
fatcat:pg2m3hbcrfhc5ip5g6glgnlf4i
Antibody mediated neutralization of authentic SARS-CoV-2 B.1.617 variants harboring L452R and T478K/E484Q
[article]
2021
medRxiv
pre-print
The capacity of convalescent and vaccine-elicited sera and monoclonal antibodies (mAb) to neutralize SARS-CoV-2 variants is currently of high relevance to assess the protection against infections. We performed a cell culture-based neutralization assay focusing on authentic SARS-CoV-2 variants B.1.617.1 (Kappa), B.1.617.2 (Delta), B.1.427/B.1.429 (Epsilon), all harboring the spike substitution L452R. We found that authentic SARS-CoV-2 variants harboring L452R had reduced susceptibility to
doi:10.1101/2021.08.09.21261704
fatcat:v6geon6skvcbre6nuq3l5in4um
... scent and vaccine-elicited sera and mAbs. Compared to B.1, Kappa and Delta showed a reduced neutralization by convalescent sera by a factor of 5.71 and 3.64, respectively, which constitutes a 2-fold greater reduction when compared to Epsilon. BNT162b2 and mRNA-1273 vaccine-elicited sera were less effective against Kappa, Delta, and Epsilon compared to B.1. No difference was observed between Kappa and Delta towards vaccine-elicited sera, whereas convalescent sera were 1.6-fold less effective against Delta, respectively. Both B.1.617 variants Kappa (+E484Q) and Delta (+T478K) were less susceptible to either casirivimab or imdevimab. In conclusion, in contrast to the parallel circulating Kappa variant, the neutralization efficiency of convalescent and vaccine-elicited sera against Delta was moderately reduced. Delta was resistant to imdevimab, which, however, might be circumvented by a combination therapy with casirivimab together.
Bamlanivimab does not neutralize two SARS-CoV-2 variants carrying E484K in vitro
[article]
2021
medRxiv
pre-print
The IgG1 monoclonal antibody (mAb) bamlanivimab (LY-CoV555) prevents viral attachment and entry into human cells by blocking attachment to the ACE2 receptor. However, whether bamlanivimab is equally effective against SARS-CoV-2 emerging variants of concern (VOC) is not fully known. Hence, the aim of this study was to determine whether bamlanivimab is equally effective against SARS-CoV-2 emerging VOC. The ability of bamlanivimab to neutralize five SARS-CoV-2 variants including B.1.1.7 (mutations
doi:10.1101/2021.02.24.21252372
fatcat:fioxkqln55a3jlxlxuh62ukuiq
... include N501Y and del69/70), B.1.351 (mutations include E484K and N501Y) and P.2 (mutations include E484K in the absence of a N501Y mutation) was analyzed in infectious cell culture using CaCo2 cells. Additionally, we analyzed vaccine-elicited sera after immunization with BNT162b2, and convalescent sera for their ability to neutralize SARS-CoV-2 variants. We found that the variant B.1.1.7, as well as two isolates from early 2020 (FFM1 and FFM7) could be efficiently neutralized by bamlanivimab (titer 1/1280, respectively), however, no neutralization effect could be detected against either B.1.351 or P.2, both harboring the E484K substitution. Vaccine-elicited sera showed slightly decreased neutralizing activity against B.1.1.7, B.1.351 and P.2. Our in vitro findings indicate that, in contrast to vaccine-elicited sera, bamlanivimab may not provide efficacy against SARS-CoV-2 variants harboring the E484K substitution. Confirmation of the SARS-CoV-2 variant, including screening for E484K, may be needed before initiating mAb treatment with bamlanivimab to ensure both efficacious and efficient use of the antibody product. Hence, variant-specific mAb agents may be required to treat emerging VOC.
Talk "On The Scalability Of Data Reduction Techniques In Current And Upcoming HPC Systems From An Application Perspective"
2017
Zenodo
Talk presented at "The 1st International Workshop on Data Reduction for Big Scientific Data (DRBSD-1)" held in conjunction with ISC 2017 in Frankfurt, Germany. Accompanying the paper with the same title.
doi:10.5281/zenodo.1000736
fatcat:on5cld7kwvgtffhsqralgt56xq
Spectral Control via Multi-Species Effects in PW-Class Laser-Ion Acceleration
[article]
2020
arXiv
pre-print
Laser-ion acceleration with ultra-short pulse, PW-class lasers is dominated by non-thermal, intra-pulse plasma dynamics. The presence of multiple ion species or multiple charge states in targets leads to characteristic modulations and even mono-energetic features, depending on the choice of target material. As spectral signatures of generated ion beams are frequently used to characterize underlying acceleration mechanisms, thermal, multi-fluid descriptions require a revision for predictive
arXiv:1903.06428v2
fatcat:rrhlj4per5fnbciklqjsy4gu5e
... ilities and control in next-generation particle beam sources. We present an analytical model with explicit inter-species interactions, supported by extensive ab initio simulations. This enables us to derive important ensemble properties from the spectral distribution resulting from those multi-species effects for arbitrary mixtures. We further propose a potential experimental implementation with a novel cryogenic target, delivering jets with variable mixtures of hydrogen and deuterium. Free from contaminants and without strong influence of hardly controllable processes such as ionization dynamics, this would allow a systematic realization of our predictions for the multi-species effect.
Scalable Multi-Platform PIC Simulations as an Open Science Service
2018
Zenodo
PIConGPU is a fully open, community-driven, 3D and 2D3V particle-in-cell code for the age of heterogeneous, many-core driven supercomputing. Developed in a single-source C++ code base, PIConGPU supports both "traditional" CPU architectures and modern, highly parallel architectures such as OpenPOWER, Xeon Phi, and Nvidia GPUs. PIConGPU has been shown to be suitable for production runs on the full system size of TOP5 clusters such as Titan (ORNL) and Piz Daint (CSCS). Machines like those
doi:10.5281/zenodo.1345079
fatcat:rg4nhdlqp5gcxa3ldgzq4l4a5e
... le few-hour turnarounds for full 3D3V simulations on complex studies such as laser-ion acceleration from mass-limited targets, long-scale laser-wakefield acceleration with high bunch charges, and hybrid acceleration schemes. The resulting output of systematic parameter scans (PBytes+) raises a severe challenge for data centers. We address these issues with modern IO frameworks, performance modeling, and in situ data reduction techniques. Using such online methods we can investigate a wide range of observables relevant for experiments and run dozens of simulations at the same time frame as an experimental beam time. PIConGPU is further complemented by modern methods for photon generation, transport, as well as X-ray interaction. This simulation framework aims to provide documented, installable, and re-usable software components for the community, well-suited for open data (openPMD) and open science workflows without restrictions. Latest developments include a python-centric, extensive framework for specific experiments, which provides all of the above in an intuitive, non-expert user interface.
Antibody-Mediated Neutralization of Authentic SARS-CoV-2 B.1.617 Variants Harboring L452R and T478K/E484Q
2021
Viruses
The capacity of convalescent and vaccine-elicited sera and monoclonal antibodies (mAb) to neutralize SARS-CoV-2 variants is currently of high relevance to assess the protection against infections. We performed a cell culture-based neutralization assay focusing on authentic SARS-CoV-2 variants B.1.617.1 (Kappa), B.1.617.2 (Delta), B.1.427/B.1.429 (Epsilon), all harboring the spike substitution L452R. We found that authentic SARS-CoV-2 variants harboring L452R had reduced susceptibility to
doi:10.3390/v13091693
pmid:34578275
pmcid:PMC8473269
fatcat:a3nkmdxjbra5plgaezrd7atnna
... scent and vaccine-elicited sera and mAbs. Compared to B.1, Kappa and Delta showed a reduced neutralization by convalescent sera by a factor of 8.00 and 5.33, respectively, which constitutes a 2-fold greater reduction when compared to Epsilon. BNT162b2 and mRNA-1273 vaccine-elicited sera were less effective against Kappa, Delta, and Epsilon compared to B.1. No difference was observed between Kappa and Delta towards vaccine-elicited sera, whereas convalescent sera were 1.51-fold less effective against Delta, respectively. Both B.1.617 variants Kappa (+E484Q) and Delta (+T478K) were less susceptible to either casirivimab or imdevimab. In conclusion, in contrast to the parallel circulating Kappa variant, the neutralization efficiency of convalescent and vaccine-elicited sera against Delta was moderately reduced. Delta was resistant to imdevimab, which, however, might be circumvented by combination therapy with casirivimab together.
Talk "Next-Generation Simulations For XFEL-Plasma Interactions With Solid Density Targets With PIConGPU"
2017
Zenodo
PIConGPU reportedly is the fastest particle-in-cell code in the world with respect to sustained Flop/s. Written in performance-portable, single-source C++ we constantly push the envelope towards Exascale laser-plasma modeling. However, solving previously week-long simulation tasks in a few hours with a speedy framework is only the beginning. This talk will present the architecture and recent additions driving PIConGPU. As we speak, we run on the fastest machines and the community approaches a
doi:10.5281/zenodo.1001894
fatcat:wpn7tet6x5ebljivdwjyohhrvi
... w generation of TOP10 clusters. Within those, many-core computing architectures and severe limitations in available I/O bandwidth demand fundamental rethinking of established modeling workflows towards in situ-processing. We present our ready-to-use open-source solutions and address scientific repeatability, data-reduction in I/O, predictability and new atomic modeling for XFEL pump-probe experiments.
Spectral control via multi-species effects in PW-class laser-ion acceleration
2020
Plasma Physics and Controlled Fusion
://orcid.org/0000-0001-6200-6406 Lieselotte Obst-Huebl https://orcid.org/0000-0001-9236-8037 Tim Ziegler https://orcid.org/0000-0002-3727-7017 Marco Garten https://orcid.org/0000-0001-6994-2475 René ...
Widera https://orcid.org/0000-0003-1642-0459 Karl Zeil https://orcid.org/0000-0003-3926-409X Thomas E Cowan https://orcid.org/0000-0002-5845-000X Michael Bussmann https://orcid.org/0000-0002-8258 ...
doi:10.1088/1361-6587/abbe33
fatcat:vgbfknfodjetzpyx4gxj35vhym
Exascale Laser Plasma Physics - From Computational Speed to Predictions
2019
Zenodo
Invited presentation given by Axel Huebl (HZDR, Germany) at the EPS 2019 conference in Milano (Italy) on July 11th, 2019.
doi:10.5281/zenodo.3332813
fatcat:6xlq64nycjgh3acn52xjs5hn2i
Alpaka -- An Abstraction Library for Parallel Kernel Acceleration
2016
2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Porting applications to new hardware or programming models is a tedious and error prone process. Every help that eases these burdens is saving developer time that can then be invested into the advancement of the application itself instead of preserving the status-quo on a new platform. The Alpaka library defines and implements an abstract hierarchical redundant parallelism model. The model exploits parallelism and memory hierarchies on a node at all levels available in current hardware. By
doi:10.1109/ipdpsw.2016.50
dblp:conf/ipps/ZenkerWWHJKNB16
fatcat:zgvpbvbeeneuznpica7l6h2owm
... so, it allows achieving platform and performance portability across various types of accelerators by ignoring specific unsupported levels and utilizing only the ones supported on a specific accelerator. All hardware types (multi- and many-core CPUs, GPUs and other accelerators) are supported and can be programmed in the same way. The Alpaka C++ template interface allows for straightforward extension of the library to support other accelerators and specialization of its internals for optimization. Running Alpaka applications on a new (and supported) platform requires the change of only one source code line instead of a lot of #ifdefs.
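The "one source code line" idea can be illustrated with a minimal C++ sketch in which the application is written against a single type alias. This is not the Alpaka API: the backend tags `SerialBackend` and `BlockedBackend`, the alias `Backend`, and the function `axpy` are all hypothetical names chosen for illustration, and both backends here run on the host CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical backend tags; these are NOT Alpaka types.
struct SerialBackend {
    // Runs the kernel once per index, in order.
    template <typename Kernel>
    static void run(std::size_t n, Kernel kernel) {
        for (std::size_t i = 0; i < n; ++i) kernel(i);
    }
};

struct BlockedBackend {
    // Walks the index range in fixed-size blocks, standing in for a
    // backend that exposes one more level of the parallel hierarchy.
    template <typename Kernel>
    static void run(std::size_t n, Kernel kernel) {
        constexpr std::size_t blockSize = 4;
        for (std::size_t begin = 0; begin < n; begin += blockSize) {
            const std::size_t end = begin + blockSize < n ? begin + blockSize : n;
            for (std::size_t i = begin; i < end; ++i) kernel(i);
        }
    }
};

// The single line to change when targeting a different backend.
using Backend = SerialBackend;

// Application code is written once; it never names a concrete backend
// unless the caller overrides the default template argument.
template <typename B = Backend>
std::vector<double> axpy(double a, const std::vector<double>& x,
                         const std::vector<double>& y) {
    assert(x.size() == y.size());
    std::vector<double> out(x.size());
    B::run(x.size(), [&](std::size_t i) { out[i] = a * x[i] + y[i]; });
    return out;
}
```

Switching the alias to `BlockedBackend` changes how the index range is traversed without touching `axpy`, which is the effect the abstract attributes to changing one line rather than scattering preprocessor conditionals through the code.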
Challenges Porting a C++ Template-Metaprogramming Abstraction Layer to Directive-based Offloading
[article]
2022
arXiv
pre-print
HPC systems employ a growing variety of compute accelerators with different architectures and from different vendors. Large scientific applications are required to run efficiently across these systems but need to retain a single code-base in order to not stifle development. Directive-based offloading programming models set out to provide the required portability, but, to existing codes, they themselves represent yet another API to port to. Here, we present our approach of porting the
arXiv:2110.08650v2
fatcat:65k2l6te6baldcjhsqjcvbgupq
... ated particle-in-cell code PIConGPU to OpenACC and OpenMP target by adding two new backends to its existing C++-template metaprogramming-based offloading abstraction layer alpaka and avoiding other modifications to the application code. We introduce our approach in the face of conflicts between requirements and available features in the standards as well as practical hurdles posed by immature compiler support.