
Zero Overhead Modern C++ For Mapping To Any Programming Model

Axel Huebl, Alexander Matthes, Benjamin Worpitz, Erik Zenker, René Widera, Guido Juckeland, Michael Bussmann
2018 Zenodo  
Widera et al., ISC 2016, DOI: 10.1007/978-3-319-46079-6_21  ...  [figure: frames 0 to 4, momentum axis]  ... 
doi:10.5281/zenodo.1304271 fatcat:xqh2ghsfqvbppbmppwie54khbe

Porting the Plasma Simulation PIConGPU to Heterogeneous Architectures with Alpaka

Rene Widera, Erik Zenker, Guido Juckeland, Benjamin Worpitz, Axel Huebl, Andreas Knuepfer, Wolfgang E Nagel, Michael Bussmann
2016 Zenodo  
René Widera, Erik Zenker, Guido Juckeland • Computational Radiation Physics • www.hzdr.de/crp { r.widera, e.zenker, g.juckeland }@hzdr.de  ...  Structures for Particles and Fields  ... 
doi:10.5281/zenodo.6336086 fatcat:nfx467scrjhmvk3cjvgn2wwhse

Metrics and Design of an Instruction Roofline Model for AMD GPUs [article]

Matthew Leinhauser, René Widera, Sergei Bastrakov, Alexander Debus, Michael Bussmann, Sunita Chandrasekaran
2021 arXiv   pre-print
Due to the recent announcement of the Frontier supercomputer, many scientific application developers are working to make their applications compatible with AMD architectures (CPU-GPU), which means moving away from the traditional CPU and NVIDIA-GPU systems. Due to the current limitations of profiling tools for AMD GPUs, this shift leaves a void in how to measure application performance on AMD GPUs. In this paper, we design an instruction roofline model for AMD GPUs using AMD's ROCProfiler and a
more » ... benchmarking tool, BabelStream (the HIP implementation), as a way to measure an application's performance in instructions and memory transactions on new AMD hardware. Specifically, we create instruction roofline models for a case study scientific application, PIConGPU, an open source particle-in-cell (PIC) simulations application used for plasma and laser-plasma physics on the NVIDIA V100, AMD Radeon Instinct MI60, and AMD Instinct MI100 GPUs. When looking at the performance of multiple kernels of interest in PIConGPU we find that although the AMD MI100 GPU achieves a similar, or better, execution time compared to the NVIDIA V100 GPU, profiling tool differences make comparing performance of these two architectures hard. When looking at execution time, GIPS, and instruction intensity, the AMD MI60 achieves the worst performance out of the three GPUs used in this work.
arXiv:2110.08221v2 fatcat:uusc263lufd7fiu6dpm6qupi5e
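
The abstract above names two quantities, GIPS and instruction intensity, without showing how they combine into a roofline. The following is a minimal sketch under stated assumptions, not the authors' tooling: the counter names, the `KernelCounters` type, and all numbers are hypothetical, and the inputs are assumed to be wavefront-level instruction counts and memory transaction counts already exported from a profiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelCounters:
    """Hypothetical per-kernel counters, assumed to come from a profiler export."""
    wave_instructions: int    # wavefront-level instructions executed
    memory_transactions: int  # memory transactions issued
    runtime_s: float          # kernel execution time in seconds


def achieved_gips(k: KernelCounters) -> float:
    """Achieved performance in billions of wavefront-level instructions per second."""
    return k.wave_instructions / k.runtime_s / 1e9


def instruction_intensity(k: KernelCounters) -> float:
    """Wavefront-level instructions executed per memory transaction."""
    return k.wave_instructions / k.memory_transactions


def roofline_ceiling(intensity: float, peak_gips: float, peak_gtxn_per_s: float) -> float:
    """Attainable GIPS at a given intensity: the lower of the compute roof
    and the memory roof (intensity times peak giga-transactions per second)."""
    return min(peak_gips, intensity * peak_gtxn_per_s)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from any GPU.
    kernel = KernelCounters(
        wave_instructions=2_000_000_000,
        memory_transactions=500_000_000,
        runtime_s=0.5,
    )
    intensity = instruction_intensity(kernel)
    print("achieved GIPS:", achieved_gips(kernel))
    print("instruction intensity:", intensity)
    print("ceiling:", roofline_ceiling(intensity, peak_gips=10.0, peak_gtxn_per_s=2.0))
```

A kernel whose achieved GIPS sits well below its ceiling at the same intensity has headroom on that hardware. Comparing across vendors additionally requires that the counters mean the same thing on each profiler, which is the difficulty the abstract points out.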

LLAMA: The Low-Level Abstraction For Memory Access [article]

Bernhard Manfred Gruber, Guilherme Amadio, Jakob Blomer, Alexander Matthes, René Widera, Michael Bussmann
2021 arXiv   pre-print
René Widera finished his education as IT specialist for application development at the Technical University Dresden (Germany) in 2009.  ...  Rene Widera provided several additional foundational ideas on which LLAMA was built and Conflict of interest The authors declare no potential conflict of interests.  ... 
arXiv:2106.04284v2 fatcat:pg2m3hbcrfhc5ip5g6glgnlf4i

Antibody mediated neutralization of authentic SARS-CoV-2 B.1.617 variants harboring L452R and T478K/E484Q [article]

Alexander Wilhelm, Tuna Toptan, Christiane Pallas, Timo Wolf, Udo Goetsch, Rene Gottschalk, Maria JGT Vehreschild, Sandra Ciesek, Marek Widera
2021 medRxiv   pre-print
The capacity of convalescent and vaccine-elicited sera and monoclonal antibodies (mAb) to neutralize SARS-CoV-2 variants is currently of high relevance to assess the protection against infections. We performed a cell culture-based neutralization assay focusing on authentic SARS-CoV-2 variants B.1.617.1 (Kappa), B.1.617.2 (Delta), B.1.427/B.1.429 (Epsilon), all harboring the spike substitution L452R. We found that authentic SARS-CoV-2 variants harboring L452R had reduced susceptibility to
more » ... scent and vaccine-elicited sera and mAbs. Compared to B.1, Kappa and Delta showed a reduced neutralization by convalescent sera by a factor of 5.71 and 3.64, respectively, which constitutes a 2-fold greater reduction when compared to Epsilon. BNT162b2 and mRNA-1273 vaccine-elicited sera were less effective against Kappa, Delta, and Epsilon compared to B.1. No difference was observed between Kappa and Delta towards vaccine-elicited sera, whereas convalescent sera were 1.6-fold less effective against Delta. Both B.1.617 variants Kappa (+E484Q) and Delta (+T478K) were less susceptible to either casirivimab or imdevimab. In conclusion, in contrast to the parallel circulating Kappa variant, the neutralization efficiency of convalescent and vaccine-elicited sera against Delta was moderately reduced. Delta was resistant to imdevimab, which, however, might be circumvented by combination therapy with casirivimab.
doi:10.1101/2021.08.09.21261704 fatcat:v6geon6skvcbre6nuq3l5in4um

Bamlanivimab does not neutralize two SARS-CoV-2 variants carrying E484K in vitro [article]

Marek Widera, Alexander Wilhelm, Sebastian Hoehl, Christiane Pallas, Niko Kohmer, Timo Wolf, Holger F Rabenau, Victor M Corman, Christian Drosten, Maria JGT Vehreschild, Udo Goetsch, Rene Gottschalk (+1 others)
2021 medRxiv   pre-print
The IgG1 monoclonal antibody (mAb) bamlanivimab (LY-CoV555) prevents viral attachment and entry into human cells by blocking attachment to the ACE2 receptor. However, whether bamlanivimab is equally effective against SARS-CoV-2 emerging variants of concern (VOC) is not fully known. Hence, the aim of this study was to determine whether bamlanivimab is equally effective against SARS-CoV-2 emerging VOC. The ability of bamlanivimab to neutralize five SARS-CoV-2 variants including B.1.1.7 (mutations
more » ... include N501Y and del69/70), B.1.351 (mutations include E484K and N501Y) and P.2 (mutations include E484K in the absence of a N501Y mutation) was analyzed in infectious cell culture using CaCo2 cells. Additionally, we analyzed vaccine-elicited sera after immunization with BNT162b2, and convalescent sera, for their ability to neutralize SARS-CoV-2 variants. We found that the variant B.1.1.7, as well as two isolates from early 2020 (FFM1 and FFM7), could be efficiently neutralized by bamlanivimab (titer 1/1280, respectively); however, no neutralization effect could be detected against either B.1.351 or P.2, both harboring the E484K substitution. Vaccine-elicited sera showed slightly decreased neutralizing activity against B.1.1.7, B.1.351 and P.2. Our in vitro findings indicate that, in contrast to vaccine-elicited sera, bamlanivimab may not provide efficacy against SARS-CoV-2 variants harboring the E484K substitution. Confirmation of the SARS-CoV-2 variant, including screening for E484K, may be needed before initiating mAb treatment with bamlanivimab to ensure both efficacious and efficient use of the antibody product. Hence, variant-specific mAb agents may be required to treat emerging VOC.
doi:10.1101/2021.02.24.21252372 fatcat:fioxkqln55a3jlxlxuh62ukuiq

Talk "On The Scalability Of Data Reduction Techniques In Current And Upcoming HPC Systems From An Application Perspective"

Axel Huebl, René Widera, Felix Schmitt, Alexander Matthes, Norbert Podhorszki, Jong Youl Choi, Scott Klasky, Michael Bussmann
2017 Zenodo  
Talk presented at "The 1st International Workshop on Data Reduction for Big Scientific Data (DRBSD-1)" held in conjunction with ISC 2017 in Frankfurt, Germany. Accompanying the paper with the same title.
doi:10.5281/zenodo.1000736 fatcat:on5cld7kwvgtffhsqralgt56xq

Spectral Control via Multi-Species Effects in PW-Class Laser-Ion Acceleration [article]

Axel Huebl, Martin Rehwald, Lieselotte Obst-Huebl, Tim Ziegler, Marco Garten, René Widera, Karl Zeil, Thomas E. Cowan, Michael Bussmann, Ulrich Schramm, Thomas Kluge
2020 arXiv   pre-print
Laser-ion acceleration with ultra-short pulse, PW-class lasers is dominated by non-thermal, intra-pulse plasma dynamics. The presence of multiple ion species or multiple charge states in targets leads to characteristic modulations and even mono-energetic features, depending on the choice of target material. As spectral signatures of generated ion beams are frequently used to characterize underlying acceleration mechanisms, thermal, multi-fluid descriptions require a revision for predictive
more » ... ilities and control in next-generation particle beam sources. We present an analytical model with explicit inter-species interactions, supported by extensive ab initio simulations. This enables us to derive important ensemble properties from the spectral distribution resulting from those multi-species effects for arbitrary mixtures. We further propose a potential experimental implementation with a novel cryogenic target, delivering jets with variable mixtures of hydrogen and deuterium. Free from contaminants and without strong influence of hardly controllable processes such as ionization dynamics, this would allow a systematic realization of our predictions for the multi-species effect.
arXiv:1903.06428v2 fatcat:rrhlj4per5fnbciklqjsy4gu5e

Scalable Multi-Platform PIC Simulations as an Open Science Service

Axel Huebl, Richard Pausch, René Widera, Marco Garten, Alexander Debus, Ilja Goethel, Alexander Matthes, Felix Meyer, Benjamin Worpitz, Sebastian Starke, Jeffrey Kelling, Sophie Rudat (+6 others)
2018 Zenodo  
PIConGPU is a fully open, community-driven, 3D and 2D3V particle-in-cell code for the age of heterogeneous, many-core driven supercomputing. Developed in a single-source C++ code base, PIConGPU supports both "traditional" CPU architectures as well as modern and highly parallel architectures such as OpenPOWER, Xeon Phi, and Nvidia GPUs. PIConGPU has been shown to be suitable for production runs on the full system size of TOP5 clusters such as Titan (ORNL) and Piz Daint (CSCS). Machines like those
more » ... le few-hour turnarounds for full 3D3V simulations on complex studies such as laser-ion acceleration from mass-limited targets, long-scale laser-wakefield acceleration with high bunch charges, and hybrid acceleration schemes. The resulting output of systematic parameter scans (PBytes+) raises a severe challenge for data centers. We address these issues with modern IO frameworks, performance modeling, and in situ data reduction techniques. Using such online methods we can investigate a wide range of observables relevant for experiments and run dozens of simulations within the same time frame as an experimental beam time. PIConGPU is further complemented by modern methods for photon generation, transport, as well as X-ray interaction. This simulation framework aims to provide documented, installable, and re-usable software components for the community, well-suited for open data (openPMD) and open science workflows without restrictions. Latest developments include a Python-centric, extensive framework for specific experiments, which provides all of the above in an intuitive, non-expert user interface.
doi:10.5281/zenodo.1345079 fatcat:rg4nhdlqp5gcxa3ldgzq4l4a5e

Antibody-Mediated Neutralization of Authentic SARS-CoV-2 B.1.617 Variants Harboring L452R and T478K/E484Q

Alexander Wilhelm, Tuna Toptan, Christiane Pallas, Timo Wolf, Udo Goetsch, Rene Gottschalk, Maria J. G. T. Vehreschild, Sandra Ciesek, Marek Widera
2021 Viruses  
The capacity of convalescent and vaccine-elicited sera and monoclonal antibodies (mAb) to neutralize SARS-CoV-2 variants is currently of high relevance to assess the protection against infections. We performed a cell culture-based neutralization assay focusing on authentic SARS-CoV-2 variants B.1.617.1 (Kappa), B.1.617.2 (Delta), B.1.427/B.1.429 (Epsilon), all harboring the spike substitution L452R. We found that authentic SARS-CoV-2 variants harboring L452R had reduced susceptibility to
more » ... scent and vaccine-elicited sera and mAbs. Compared to B.1, Kappa and Delta showed a reduced neutralization by convalescent sera by a factor of 8.00 and 5.33, respectively, which constitutes a 2-fold greater reduction when compared to Epsilon. BNT162b2 and mRNA-1273 vaccine-elicited sera were less effective against Kappa, Delta, and Epsilon compared to B.1. No difference was observed between Kappa and Delta towards vaccine-elicited sera, whereas convalescent sera were 1.51-fold less effective against Delta. Both B.1.617 variants Kappa (+E484Q) and Delta (+T478K) were less susceptible to either casirivimab or imdevimab. In conclusion, in contrast to the parallel circulating Kappa variant, the neutralization efficiency of convalescent and vaccine-elicited sera against Delta was moderately reduced. Delta was resistant to imdevimab, which, however, might be circumvented by combination therapy with casirivimab.
doi:10.3390/v13091693 pmid:34578275 pmcid:PMC8473269 fatcat:a3nkmdxjbra5plgaezrd7atnna

Talk "Next-Generation Simulations For XFEL-Plasma Interactions With Solid Density Targets With PIConGPU"

Axel Huebl, René Widera, Richard Pausch, Marco Garten, Heiko Burau, Alexander Matthes, Benjamin Worpitz, Fabian Koller, Thomas Kluge, Jan Vorberger, Alexander Debus, Thomas Cowan (+3 others)
2017 Zenodo  
PIConGPU reportedly is the fastest particle-in-cell code in the world with respect to sustained Flop/s. Written in performance-portable, single-source C++ we constantly push the envelope towards Exascale laser-plasma modeling. However, solving previously week-long simulation tasks in a few hours with a speedy framework is only the beginning. This talk will present the architecture and recent additions driving PIConGPU. As we speak, we run on the fastest machines and the community approaches a
more » ... w generation of TOP10 clusters. Within those, many-core computing architectures and severe limitations in available I/O bandwidth demand fundamental rethinking of established modeling workflows towards in situ-processing. We present our ready-to-use open-source solutions and address scientific repeatability, data-reduction in I/O, predictability and new atomic modeling for XFEL pump-probe experiments.
doi:10.5281/zenodo.1001894 fatcat:wpn7tet6x5ebljivdwjyohhrvi

Spectral control via multi-species effects in PW-class laser-ion acceleration

Axel Huebl, Martin Rehwald, Lieselotte Obst-Huebl, Tim Ziegler, Marco Garten, René Widera, Karl Zeil, Thomas E Cowan, Michael Bussmann, Ulrich Schramm, Thomas Kluge
2020 Plasma Physics and Controlled Fusion  
://orcid.org/0000-0001-6200-6406 Lieselotte Obst-Huebl  https://orcid.org/0000-0001-9236-8037 Tim Ziegler  https://orcid.org/0000-0002-3727-7017 Marco Garten  https://orcid.org/0000-0001-6994-2475 René  ...  Widera  https://orcid.org/0000-0003-1642-0459 Karl Zeil  https://orcid.org/0000-0003-3926-409X Thomas E Cowan  https://orcid.org/0000-0002-5845-000X Michael Bussmann  https://orcid.org/0000-0002-8258  ... 
doi:10.1088/1361-6587/abbe33 fatcat:vgbfknfodjetzpyx4gxj35vhym

Exascale Laser Plasma Physics - From Computational Speed to Predictions

Axel Huebl, René Widera, Marco Garten, Richard Pausch, Klaus Steiniger, Sergei Bastrakov, Felix Meyer, Ksenia Bastrakova, Alexander Debus, Simeon Ehrig, Matthias Werner, Benjamin Worpitz (+7 others)
2019 Zenodo  
Invited presentation given by Axel Huebl (HZDR, Germany) at the EPS 2019 conference in Milano (Italy) on July 11th, 2019.
doi:10.5281/zenodo.3332813 fatcat:6xlq64nycjgh3acn52xjs5hn2i

Alpaka -- An Abstraction Library for Parallel Kernel Acceleration

Erik Zenker, Benjamin Worpitz, Rene Widera, Axel Huebl, Guido Juckeland, Andreas Knupfer, Wolfgang E. Nagel, Michael Bussmann
2016 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)  
Porting applications to new hardware or programming models is a tedious and error-prone process. Any help that eases this burden saves developer time that can then be invested in advancing the application itself instead of preserving the status quo on a new platform. The Alpaka library defines and implements an abstract hierarchical redundant parallelism model. The model exploits parallelism and memory hierarchies on a node at all levels available in current hardware. By
more » ... so, it achieves platform and performance portability across various types of accelerators by ignoring specific unsupported levels and utilizing only the ones supported on a specific accelerator. All hardware types (multi- and many-core CPUs, GPUs and other accelerators) are supported and can be programmed in the same way. The Alpaka C++ template interface allows for straightforward extension of the library to support other accelerators and specialization of its internals for optimization. Running Alpaka applications on a new (and supported) platform requires changing only one source code line instead of many #ifdefs.
doi:10.1109/ipdpsw.2016.50 dblp:conf/ipps/ZenkerWWHJKNB16 fatcat:zgvpbvbeeneuznpica7l6h2owm

Challenges Porting a C++ Template-Metaprogramming Abstraction Layer to Directive-based Offloading [article]

Jeffrey Kelling, Sergei Bastrakov, Alexander Debus, Thomas Kluge, Matt Leinhauser, Richard Pausch, Klaus Steiniger, Jan Stephan, René Widera, Jeff Young, Michael Bussmann, Sunita Chandrasekaran (+1 others)
2022 arXiv   pre-print
HPC systems employ a growing variety of compute accelerators with different architectures and from different vendors. Large scientific applications are required to run efficiently across these systems but need to retain a single code-base in order to not stifle development. Directive-based offloading programming models set out to provide the required portability, but, to existing codes, they themselves represent yet another API to port to. Here, we present our approach of porting the
more » ... ated particle-in-cell code PIConGPU to OpenACC and OpenMP target by adding two new backends to its existing C++-template metaprogramming-based offloading abstraction layer alpaka and avoiding other modifications to the application code. We introduce our approach in the face of conflicts between requirements and available features in the standards as well as practical hurdles posed by immature compiler support.
arXiv:2110.08650v2 fatcat:65k2l6te6baldcjhsqjcvbgupq