Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer

Xingfu Wu, Valerie Taylor
2013 International Journal of Networked and Distributed Computing (IJNDC)  
, an aerospace application PMLB and a 3D particle-in-cell application GTC) on a large-scale multithreaded BlueGene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap resulting  ...  Many/multi-core supercomputers provide a natural programming paradigm for hybrid MPI/OpenMP scientific applications.  ...  The authors would like to acknowledge Argonne Leadership Computing Facility for the use of BlueGene/Q under DOE INCITE project "Performance Evaluation and Analysis Consortium End Station" and BGQ Tools  ...
doi:10.2991/ijndc.2013.1.4.3 fatcat:vhgrrcqzijfs7buf3x2pvqmboq

Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer

Xingfu Wu, Valerie Taylor
2013 2013 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing  
, an aerospace application PMLB and a 3D particle-in-cell application GTC) on a large-scale multithreaded BlueGene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap resulting  ...  Many/multi-core supercomputers provide a natural programming paradigm for hybrid MPI/OpenMP scientific applications.  ...  The authors would like to acknowledge Argonne Leadership Computing Facility for the use of BlueGene/Q under DOE INCITE project "Performance Evaluation and Analysis Consortium End Station" and BGQ Tools  ...
doi:10.1109/snpd.2013.81 dblp:conf/snpd/WuT13 fatcat:qcv3ya6rafhhhoiisv3asken5a

Utilizing Hardware Performance Counters to Model and Optimize the Energy and Performance of Large Scale Scientific Applications on Power-Aware Supercomputers

Xingfu Wu, Valerie Taylor
2016 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)  
The counter-guided optimizations result in a reduction in energy by an average of 18.28% on up to 32,768 cores on Mira and 11.28% on up to 128 cores on SystemG for the aerospace application.  ...  The performance counters that compose the models are used to explore some counter-guided optimizations with two large-scale scientific applications: an earthquake simulation and an aerospace application  ...  Wallace from Illinois Institute of Technology for providing the latest MonEQ, and Argonne Leadership Computing Facility for the use of BlueGene/Q Mira under DOE INCITE project PEACES.  ...
doi:10.1109/ipdpsw.2016.78 dblp:conf/ipps/WuT16 fatcat:qmlnku6wbbapzmd43vofvasegm

waLBerla: A block-structured high-performance framework for multiphysics simulations [article]

Martin Bauer, Sebastian Eibl, Christian Godenschwager, Nils Kohl, Michael Kuron, Christoph Rettinger, Florian Schornbaum, Christoph Schwarzmeier, Dominik Thönnes, Harald Köstler, Ulrich Rüde
2019 arXiv   pre-print
We present several example applications realized with waLBerla, ranging from lattice Boltzmann methods to rigid particle simulations.  ...  The framework uses meta-programming techniques to generate highly efficient code for CPUs and GPUs from a symbolic method formulation.  ...  Acknowledgements The authors thank Daniela Anderl, Dominik Bartuschat As waLBerla is designed for massively parallel high performance computing, access to supercomputing facilities is of essential importance  ... 
arXiv:1909.13772v1 fatcat:b2iwdbbugjeebk3diazleysedi

Parallelised Hoshen–Kopelman algorithm for lattice-Boltzmann simulations

S. Frijters, T. Krüger, J. Harting
2015 Computer Physics Communications  
We present our parallel implementation, which is tailored to a common parallelisation scheme for the lattice-Boltzmann method, and compare it to previous work.  ...  The common factor in these seemingly disparate subjects is that both represent an opportunity for a novel application of the Hoshen-Kopelman algorithm.  ...  We thank the Jülich Supercomputing Centre and HLRS Stuttgart for the allocation of computation time.  ... 
doi:10.1016/j.cpc.2014.12.014 fatcat:qmmllvvgjvgu5pnsjdb7cf7f7q

A Solver for Massively Parallel Direct Numerical Simulation of Three-Dimensional Multiphase Flows [article]

S. Shin, J. Chergui, D. Juric
2014 arXiv   pre-print
The code is wholly written by the authors in Fortran 2003 and uses a domain decomposition strategy for parallelization with MPI.  ...  The code includes modules for flow interaction with immersed solid objects, contact line dynamics, species and thermal transport with phase change.  ...  ACKNOWLEDGEMENTS This work was supported by: (1) the Partnership for Advanced Computing in Europe (PRACE),  ... 
arXiv:1410.8568v1 fatcat:o5ear43lsvgndaux4e4tiiuhmu

Identification/Selection Of E-Cam Meso And Multi-Scale Modelling Codes For Development

Ignacio Pagonabarraga, Burkhard Duenweg, Vladimir Lobaskin, Godehard Sutmann, Carsten Hartmann, Leon Petit
2016 Zenodo  
The other two approaches (MSM, MOR) do not need a specific code, but, because they use trajectories, they need an optimal interface with the MD code.  ...  optimal performance of codes on next generation hardware.  ... 
doi:10.5281/zenodo.841696 fatcat:ylbgdlavzrewli33dtdz3nkzdi

Massively Parallel Phase-Field Simulations for Ternary Eutectic Directional Solidification [article]

Martin Bauer, Johannes Hötzer, Philipp Steinmetz, Marcus Jainta, Marco Berghoff, Florian Schornbaum, Christian Godenschwager, Harald Köstler, Britta Nestler, Ulrich Rüde
2015 arXiv   pre-print
We apply various optimization techniques, including buffering techniques, explicit SIMD kernel vectorization, and communication hiding.  ...  For a realistic simulation, we use the well established thermodynamically consistent phase-field method and improve it with a new grand potential formulation to couple the concentration evolution.  ...  All systems support vectorization, either using AVX(2) on the Intel based chips or QPX on BlueGene/Q cores.  ... 
arXiv:1506.01684v1 fatcat:equ5plm3qjdxdivlaqmqiveule

Massively parallel multicanonical simulations

Jonathan Gross, Johannes Zierenberg, Martin Weigel, Wolfhard Janke
2018 Computer Physics Communications  
We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as starting point and reference for practitioners in the field.  ...  resources for multicanonical simulations.  ...  Acknowledgments We would like to thank Marco Mueller for fruitful discussions. Part of this work has been financially sup-  ... 
doi:10.1016/j.cpc.2017.10.018 fatcat:mmczczphabcebcz3tnjcwyirjm

E-Cam Software Porting And Benchmarking Data I

Alan O'Cais, Liang Liang, Jony Castagna
2017 Zenodo  
These architectures included Cray XC, Bluegene/Q and cluster systems (with Xeon Phi or GPU accelerators on various systems).  ...  Ludwig is a Lattice Boltzmann code used for the simulation of complex fluids. It is open source and available from the Ludwig svn repository. The version tested is 0.4.6.  ...
doi:10.5281/zenodo.1191427 fatcat:wvtsup3765gpbigz7bw2x4pt3q

Optimizing the performance of streaming numerical kernels on the IBM Blue Gene/P PowerPC 450 processor

Tareq Malas, Aron J. Ahmadia, Jed Brown, John A. Gunnels, David E. Keyes
2012 The international journal of high performance computing applications  
We propose a new method for constructing streaming numerical kernels using a high-level assembly synthesis and optimization framework.  ...  We describe an implementation of this method in Python targeting the IBM Blue Gene/P supercomputer's PowerPC 450 core.  ...  Acknowledgments We are grateful to Andy Ray Terrel for his helpful commentary on an early draft of this paper.  ... 
doi:10.1177/1094342012444795 fatcat:ze2oemqa5fcxbbsdgelpbgzf74

Lattice–Boltzmann simulations for complex geometries on high-performance computers

Andreas Lintermann, Wolfgang Schröder
2020 CEAS Aeronautical Journal  
Simulations on large-scale meshes are performed by a high-scaling lattice-Boltzmann method using the second-order accurate interpolated bounce-back boundary conditions for no-slip walls.  ...  Such simulations, furthermore, presume optimized scalability on high-performance computers to solve high-dimensional physical problems in an adequate time.  ...  A promising numerical method for the efficient simulation of flows are lattice-Boltzmann (LB) methods [3, 27, 31].  ...
doi:10.1007/s13272-020-00450-1 fatcat:tblfyydgyjgbfcztqjs5egp5gq

Zonal Flow Solver (ZFS): a highly efficient multi-physics simulation framework

Andreas Lintermann, Matthias Meinke, Wolfgang Schröder
2020 International journal of computational fluid dynamics (Print)  
Therefore, the framework Zonal Flow Solver (ZFS) featuring lattice-Boltzmann, finite-volume, discontinuous Galerkin, level set and Lagrange solvers has been developed.  ...  As a consequence, numerical codes need to run efficiently on high-performance computers.  ...  CODE_Saturne relies on a FV method and scales up to 786,432 cores on BlueGene/Q machines. In contrast, MOOSE relies on fully implicit methods and scales up to 32,768 cores with an efficiency of 77%  ...
doi:10.1080/10618562.2020.1742328 fatcat:qgdstnw3cbbmjjmmt7cuiogyaq

Building and utilizing fault tolerance support tools for the GASPI applications

Faisal Shahzad, Moritz Kreutzer, Thomas Zeiser, Rui Machado, Andreas Pieper, Georg Hager, Gerhard Wellein
2016 The international journal of high performance computing applications  
This results in the decrease of mean time to failures (MTTF) of the systems with every newer generation, which is an alarming trend.  ...  Education and Research (BMBF) under project "A Fault Tolerant Environment for Peta-scale MPI-solvers" (FETOL) (grant No. 01IH11011C).  ...  Acknowledgements This work was partly supported by the German Research Foundation (DFG) through the Priority Programme 1648 "Software for Exascale Computing" (SPPEXA) and partly by Federal Ministry of  ... 
doi:10.1177/1094342016677085 fatcat:ibbg4ldgdjcn5ifv2jue6tdome

Investigating power capping toward energy-efficient scientific applications

Azzam Haidar, Heike Jagode, Phil Vaccaro, Asim YarKhan, Stanimire Tomov, Jack Dongarra
2018 Concurrency and Computation  
In this paper, we discuss strategies for power measurement and power control to offer scientific application developers the basic building blocks required to develop dynamic optimization strategies while  ...  Analyzing how hardware components consume power at run time is key in determining which of the aforementioned categories an application fits into.  ...  For example, on AMD Family 15h machines, power can be accessed with the PAPI lmsensors component; for IBM BlueGene/Q, PAPI provides the emon component; and for NVIDIA GPUs, PAPI provides the nvml component  ... 
doi:10.1002/cpe.4485 fatcat:k7iv4hplx5fqzne6kupmpjharq