176 Hits in 2.9 sec

Accelerating a C++ CFD Code with OpenACC

Jiri Kraus, Michael Schlottke, Andrew Adinetz, Dirk Pleiter
2014 2014 First Workshop on Accelerator Programming using Directives  
Taking the C++ flow solver ZFS as an example, we show that the directive-based programming model allows one to achieve good performance with reasonable effort, even for mature codes with many lines of  ...  For the kernel most affected by the memory access pattern, we compare the initial array of structures memory layout with a structure of arrays layout.  ...  ACKNOWLEDGEMENTS This work has been carried out in the scope of the NVIDIA Application Lab at Jülich in collaboration with the JARA-HPC SimLab Fluids & Solids Engineering and the Institute of Aerodynamics  ... 
doi:10.1109/waccpd.2014.11 dblp:conf/sc/KrausSAP14 fatcat:jn42zcof7bczhl4uajlpljblue
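
The array-of-structures versus structure-of-arrays comparison mentioned in the snippet above can be illustrated with a minimal sketch. It is not taken from ZFS: the field names (`rho`, `u`, `p`) and the helper functions are hypothetical and only show how the two layouts differ when a kernel reads a single field.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of structures (AoS): all fields of one cell are stored together.
// Field names are hypothetical, chosen only for illustration.
struct CellAoS {
    double rho;  // density
    double u;    // velocity
    double p;    // pressure
};

// Structure of arrays (SoA): each field is stored contiguously across cells.
struct CellsSoA {
    std::vector<double> rho, u, p;
    explicit CellsSoA(std::size_t n) : rho(n), u(n), p(n) {}
};

// Reads only rho; consecutive reads are sizeof(CellAoS) bytes apart.
double sum_rho_aos(const std::vector<CellAoS>& cells) {
    double s = 0.0;
    for (const CellAoS& c : cells) s += c.rho;
    return s;
}

// Reads only rho; consecutive reads are adjacent in memory (unit stride).
double sum_rho_soa(const CellsSoA& cells) {
    // Layout invariant: every field array covers the same number of cells.
    assert(cells.rho.size() == cells.u.size() && cells.u.size() == cells.p.size());
    double s = 0.0;
    for (double r : cells.rho) s += r;
    return s;
}
```

Both functions compute the same sum; only the memory access pattern differs, which is the property such a layout comparison would measure.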

Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: programming productivity, performance, and energy consumption [article]

Suejb Memeti and Lu Li and Sabri Pllana and Joanna Kolodziej and Christoph Kessler
2017 arXiv   pre-print
a GPU accelerator or an Intel Xeon Phi co-processor.  ...  To evaluate the programming productivity we use our homegrown tool CodeStat, which enables us to determine the percentage of code lines that was required to parallelize the code using a specific framework  ...  [Figure residue: benchmarks CFD, Hotspot, LUD; axis Energy [J]; series CPU, Accelerator; panel (b) Energy Consumption] Figure 4: A comparison of OpenMP, OpenCL, and CUDA with respect to (a)  ... 
arXiv:1704.05316v1 fatcat:lax3kghaxnanxixklx3haavlxa

Directive-based GPU programming for computational fluid dynamics

Brent P. Pickering, Charles W. Jackson, Thomas R.W. Scogland, Wu-Chun Feng, Christopher J. Roy
2015 Computers & Fluids  
We examine the process of applying the OpenACC Fortran API to a test CFD code that serves as a proxy for a full-scale research code developed at Virginia Tech; this test code is used to assess the performance  ...  Keywords: Directive-based programming, OpenACC, Fortran, Finite-difference method. Abstract: Directive-based programming of graphics processing units (GPUs) has recently appeared as a viable alternative to using  ...  Acknowledgments This work was supported by an Air Force Office of Scientific Research (AFOSR) Basic Research Initiative in the Computational Mathematics program with Dr.  ... 
doi:10.1016/j.compfluid.2015.03.008 fatcat:guzjdb7llnbsbosoczysbe3r7y

An Early Performance Comparison of CUDA and OpenACC

Xuechao Li, Po-Chou Shih, H. Bevrani, W. Shuhui
2018 MATEC Web of Conferences  
Overall we found that OpenACC is a reliable programming model and a good alternative to CUDA for accelerator devices.  ...  The results show that in terms of kernel running time, the OpenACC performance is lower than the CUDA performance because the PGI compiler needs to translate OpenACC kernels into object code while CUDA codes  ...  In OpenACC, porting legacy CPU-based code only requires adding several lines of annotations before the sections that need to be accelerated, without changing code structures [2].  ... 
doi:10.1051/matecconf/201820805002 fatcat:ul6gcqz3jrgqrallnkoqjjlqie

Recent progress and challenges in exploiting graphics processors in computational fluid dynamics

Kyle E. Niemeyer, Chih-Jen Sung
2013 Journal of Supercomputing  
Finally, recommendations for implementing CFD codes on GPUs are given and remaining challenges are discussed, such as the need to develop new strategies and redesign algorithms to enable GPU acceleration  ...  simple codes.  ...  Another avenue for accelerating applications using GPUs is OpenACC [78, 91], which uses compiler directives (e.g., #pragma) placed in Fortran, C, and C++ codes to identify sections of code to be run  ... 
doi:10.1007/s11227-013-1015-7 fatcat:jyjmfa7wqzgnjf3qakgk6cyadi

Energy Applications Challenges (SIAM CSE21) [article]

Thomas Evans
2021 figshare.com  
These areas include wind power, combustion, nuclear energy, carbon capture, fusion energy, and plasma accelerators.  ...  particle transport that, due to its stochastic nature, does not directly map to SIMT architectures. Multiple programming models and approaches are being used to achieve performance portability across a  ...  • Resolving electromagnetic turbulence • Coupling numerics between core and edge codes WarpX Challenge Problem and Codes Modeling of a chain of tens of plasma acceleration stages resulting  ... 
doi:10.6084/m9.figshare.14125667.v2 fatcat:gvh4wndrrbh7vcgk7ily24fmdi

Energy Applications Challenges (SIAM CSE21) [article]

Thomas Evans
2021 figshare.com  
These areas include wind power, combustion, nuclear energy, carbon capture, fusion energy, and plasma accelerators.  ...  particle transport that, due to its stochastic nature, does not directly map to SIMT architectures. Multiple programming models and approaches are being used to achieve performance portability across a  ...  • Resolving electromagnetic turbulence • Coupling numerics between core and edge codes WarpX Challenge Problem and Codes Modeling of a chain of tens of plasma acceleration stages resulting  ... 
doi:10.6084/m9.figshare.14125667.v3 fatcat:jiazayqlsjgbpaxe7hiolhukrq

Multi-GPU Performance Optimization of a CFD Code using OpenACC on Different Platforms [article]

Weicheng Xue, Christopher J. Roy
2020 arXiv   pre-print
This paper investigates the multi-GPU performance of a 3D buoyancy driven cavity solver using MPI and OpenACC directives on different platforms.  ...  Since the buoyancy driven cavity code is latency-bounded on the clusters examined, a series of optimizations both agnostic and tailored to the platforms are designed to reduce the latency cost and improve  ...  McCall and Behzad Baghapour for creating the original BDC code as well as giving advice, and thank Charles W. Jackson for reviewing the paper and participating in various helpful discussions.  ... 
arXiv:2006.02602v1 fatcat:vkldeh3tqfhyfovny27go7zx5y

Accelerating Hydrocodes with OpenACC, OpenCL and CUDA

J. A. Herdman, W. P. Gaudin, S. McIntosh-Smith, M. Boulton, D. A. Beckingsale, A. C. Mallinson, S. A. Jarvis
2012 2012 SC Companion: High Performance Computing, Networking Storage and Analysis  
We find that OpenACC is an extremely viable programming model for accelerator devices, improving programmer productivity and achieving better performance than OpenCL and CUDA.  ...  , and portability using a recently developed Lagrangian-Eulerian explicit hydrodynamics mini-application.  ...  The authors would like to express their thanks to Cray, in particular Alistair Hart of the Cray European Exascale Research Initiative, for their help with OpenACC and also to John Pennycook of the University  ... 
doi:10.1109/sc.companion.2012.66 dblp:conf/sc/HerdmanGMBBMJ12 fatcat:hu77tqzljrgjfbdcwsoonjz5yu

JACC: An OpenACC Runtime Framework with Kernel-Level and Multi-GPU Parallelization [article]

Kazuaki Matsumura, Simon Garcia De Gonzalo, Antonio J. Peña
2021 arXiv   pre-print
Efforts on such models involve a least engineering cost for enabling computational acceleration on multiple architectures while programmers are only required to add meta information upon sequential code  ...  This paper introduces JACC, an OpenACC runtime framework which enables the dynamic extension of OpenACC programs by serving as a transparent layer between the program and the compiler.  ...  On the other hand, JACC wraps OpenACC compilers and holds C/Fortran code for optimization. A few projects aim to assist code generation with directive-based programming.  ... 
arXiv:2110.14340v1 fatcat:acfa6g7xm5dyfajen7fqkn4yri

NAS Parallel Benchmarks for GPGPUs Using a Directive-Based Programming Model [chapter]

Rengan Xu, Xiaonan Tian, Sunita Chandrasekaran, Yonghong Yan, Barbara Chapman
2015 Lecture Notes in Computer Science  
The right choice or combination of techniques/hints are crucial for compilers to generate highly efficient codes tuned to a particular type of accelerator.  ...  such as OpenACC.  ...  For evaluation purposes, we compare the performances of our OpenACC programs with serial and third-party well tuned OpenCL  ...  [Table residue: benchmarks EP, CG, FT, IS; data sizes A, B, C; row NPB-SER]  ... 
doi:10.1007/978-3-319-17473-0_5 fatcat:25gloejzqzeetcxmaa4cxctkwu

Parallel Reservoir Simulation with OpenACC and Domain Decomposition

Zhijiang Kang, Ze Deng, Wei Han, Dongmei Zhang
2018 Algorithms  
In order to address the problems, we propose a parallel method with OpenACC to accelerate serial code and reduce the time and effort during porting an application to GPU.  ...  The experimental results indicate that (1) the proposed GPU-aided approach can outperform the CPU-based one up to about two times, meanwhile with the help of OpenACC, the workload of the transplant code  ...  Like OpenMP, its benchmark is for C/C++ and Fortran source code to identify the areas that should be accelerated using compiler directives and additional functions.  ... 
doi:10.3390/a11120213 fatcat:swrmqhkzujc3fjvozi3jckd7rm

An Improved Framework of GPU Computing for CFD Applications on Structured Grids using OpenACC [article]

Weicheng Xue, Charles W. Jackson, Christopher J. Roy
2020 arXiv   pre-print
This paper is focused on improving multi-GPU performance of a research CFD code on structured grids. MPI and OpenACC directives are used to scale the code up to 16 GPUs.  ...  A series of performance issues related to the scaling for the multi-block CFD code are addressed by applying various optimizations.  ...  [24] further applied the heterogeneous computing to accelerate a complicated CFD code on a CPU/GPU platform using MPI and OpenACC.  ... 
arXiv:2012.02925v1 fatcat:evqp7zr7afebhksc2iqexs4ewy

D7.2.2 Exploitation of HPC Tools and Techniques

Michael Lysaght, Bjorn Lindi, Vit Vondrak, John Donners, Marc Tajchman
2014 Zenodo  
For a more detailed description of each of the exploitation projects summarised here, we refer the reader to the PRACE-3IP whitepaper associated with each of the 17 projects.  ...  The objective of PRACE-3IP Work Package 7 (WP7) 'Application Enabling and Support' is to provide applications enabling support for HPC applications codes which are important for European researchers to  ...  With OpenACC, a developer can annotate C, C++ and Fortran source code to identify the areas to be accelerated using #pragma compiler directives and additional functions.  ... 
doi:10.5281/zenodo.6575525 fatcat:5y3cjsculrdejllndosbjpcgiq
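
The directive-based annotation described in the snippet above can be sketched as follows. This is a minimal illustration, not code from the PRACE report: the `saxpy` function is a hypothetical example, and the sketch assumes an OpenACC-capable compiler. A compiler without OpenACC support ignores the `#pragma acc` line and runs the loop serially on the CPU with the same result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example kernel: y[i] = a * x[i] + y[i].
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const int n = static_cast<int>(x.size());
    const float* xp = x.data();  // raw pointers so the data clauses can
    float* yp = y.data();        // name the array sections explicitly

    // Mark the loop for offloading: xp is copied to the device,
    // yp is copied to the device and back to the host afterwards.
    #pragma acc parallel loop copyin(xp[0:n]) copy(yp[0:n])
    for (int i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

The loop body is unchanged from the serial version; only the directive is added, which is the low-effort porting path the snippet refers to.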

Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems [article]

Jonathan Vincent, Jing Gong, Martin Karp, Adam Peplinski, Niclas Jansson, Artur Podobas, Andreas Jocksch, Jie Yao, Fazle Hussain, Stefano Markidis, Matts Karlsson, Dirk Pleiter (+2 others)
2021 arXiv   pre-print
We present new results on the strong parallel scaling for the OpenACC-accelerated implementation of the high-order spectral element fluid dynamics solver Nek5000.  ...  The performance results show that speed-up between 3-5 can be achieved using the GPU accelerated version compared with the CPU version on these different systems.  ...  The acceleration of Nek5000 with OpenACC directives was first explored by Markidis et al. by accelerating the mini-app Nekbone in [21] and then improved with CUDA Fortran implementations for the core  ... 
arXiv:2109.03592v3 fatcat:6e75xxahnfhpxn3plml6lradf4