Modeling and analyzing performance for highly optimized propagation steps of the lattice Boltzmann method on sparse lattices [article]

M. Wittmann, T. Zeiser, G. Hager, G. Wellein
2015 arXiv   pre-print
In this work we study performance optimizations for an MPI-parallel lattice Boltzmann-based flow solver that uses a sparse lattice representation with indirect addressing.  ...  Hence, performance optimization is a pivotal activity in this field of computational science. Not only does it reduce the time to solution, but it also allows to minimize the energy consumption.  ...  We would like to thank the Leibniz Computing Center (LRZ) in Garching, Germany for the great support during conducting the benchmarks.  ... 
arXiv:1410.0412v2 fatcat:m6ogwsit2vcwvebz2ozjlynhhy

Sparse Geometries Handling in Lattice Boltzmann Method Implementation for Graphic Processors

Tadeusz Tomczak, Roman G. Szafran
2018 IEEE Transactions on Parallel and Distributed Systems  
We describe a high-performance implementation of the lattice-Boltzmann method (LBM) for sparse geometries on graphic processors.  ...  We reached the performance of 682 MLUPS on GTX Titan (72% of peak theoretical memory bandwidth) for D3Q19 lattice arrangement and double precision data.  ...  N N501 042140 and from the Wroclaw University of Science and Technology, Faculty of Electronics, Chair of Computer Engineering statutory funds.  ... 
doi:10.1109/tpds.2018.2810237 fatcat:h3mpnkq3cfhcxbyzqawkotg4ia

Chip-level and multi-node analysis of energy-optimized lattice Boltzmann CFD simulations

Markus Wittmann, Georg Hager, Thomas Zeiser, Jan Treibig, Gerhard Wellein
2015 Concurrency and Computation  
We choose the lattice-Boltzmann method (LBM) on an Intel Sandy Bridge cluster as a prototype scenario to investigate if and how single-chip performance and power characteristics can be generalized to the  ...  First we perform an analysis of a sparse-lattice LBM implementation for complex geometries.  ...  Acknowledgments We gratefully acknowledge the support by LRZ Supercomputing Center. This work was partially supported by BMBF under grant No. 01IH08003A (project SKALB).  ... 
doi:10.1002/cpe.3489 fatcat:rq4nswskabctrlhn4sfe6tsflm

Software and DVFS Tuning for Performance and Energy-Efficiency on Intel KNL Processors

Enrico Calore, Alessandro Gabbana, Sebastiano Schifano, Raffaele Tripiccione
2018 Journal of Low Power Electronics and Applications  
As a benchmark application we use a Lattice Boltzmann code heavily optimized for this architecture, and implemented using several different arrangements of the application data in memory (data-layouts,  ...  In this work we focus on computing and energy performance of the Knights Landing Xeon Phi, the latest Intel many-core architecture processor for HPC applications.  ...  the design and implementation of the different data-structures and to the analysis of computing performances; and R.T. has designed and developed the Lattice Boltzmann algorithm.  ... 
doi:10.3390/jlpea8020018 fatcat:h6wevknrebfwxaqvcwrbywagja

GPU Acceleration of the HemeLB Code for Lattice Boltzmann Simulations in Sparse Complex Geometries

Benjamin T. Shealy, Mehrdad Yousefi, Ashwin T. Srinath, Melissa C. Smith, Ulf D. Schiller
2021 IEEE Access  
We present an implementation and scaling analysis of a GPU-accelerated kernel for HemeLB, a high-performance Lattice Boltzmann code for sparse complex geometries.  ...  It is expected that the GPU implementation will enable users of the HemeLB code to make better utilization of heterogeneous high-performance computing systems for large-scale lattice Boltzmann simulations  ...  U.S. thanks Rupert Nash, Derek Groen, and Peter Coveney for valuable discussions. Clemson University is acknowledged for generous allotment of compute time on the Palmetto cluster.  ... 
doi:10.1109/access.2021.3073667 fatcat:eq2veh6lnjcaraywsieqideq4u

Advanced Performance Analysis of HPC Workloads on Cavium ThunderX

Enrico Calore, Filippo Mantovani, Daniel Ruiz
2018 2018 International Conference on High Performance Computing & Simulation (HPCS)  
To show the possibilities offered by the available tools, we provide as an example, the analysis of a Lattice Boltzmann HPC production code, highly optimized for several architectures and now ported also  ...  In particular, as performance analysis tools we adopt Extrae and Paraver, making use of the PAPI support, initially developed by us for the ThunderX platform, and now available also upstream.  ...  Cavium Inc. has kindly supported this research providing access to documentation and platforms.  ... 
doi:10.1109/hpcs.2018.00068 dblp:conf/ieeehpcs/CaloreMR18 fatcat:32zs7aszffeo7mcjcxougqe6fe

Dynamics and performance of susceptibility propagation on synthetic data

E. Aurell, C. Ollion, Y. Roudi
2010 European Physical Journal B : Condensed Matter Physics  
We study the performance and convergence properties of the Susceptibility Propagation (SusP) algorithm for solving the Inverse Ising problem.  ...  We then show that dense connectivity, loopiness of the connectivity, and high absolute magnetization all have deteriorating effects on the performance of the algorithm.  ...  Boltzmann Learning [1] is an iterative method where in one step the correlation functions are computed given an Ising model, and in another step the Ising model couplings are modified to adjust to data  ... 
doi:10.1140/epjb/e2010-00277-0 fatcat:lzsx2ffc3nbk3eokh52z4g46cq

Performance and Power Analysis of HPC Workloads on Heterogenous Multi-Node Clusters

Filippo Mantovani, Enrico Calore
2018 Journal of Low Power Electronics and Applications  
Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes, allowing for application optimizations.  ...  In particular, we show how the same analysis techniques can be applicable on different architectures, analyzing the same HPC application on a high-end and a low-power cluster.  ...  We thank the University of Ferrara and INFN Ferrara for the access to the COKA Cluster.  ... 
doi:10.3390/jlpea8020013 fatcat:etpm56cutzhuhfhbtjprmhv5bu

A high-performance lattice Boltzmann implementation to model flow in porous media

Chongxun Pan, Jan F. Prins, Cass T. Miller
2004 Computer Physics Communications  
We examine the problem of simulating single and multiphase flow in porous medium systems at the pore scale using the lattice Boltzmann (LB) method.  ...  We investigate a two-stage implementation consisting of a sparse domain decomposition stage and a simulation stage that avoids the need to store and operate on lattice points located within a solid phase  ...  We thank Mark Reed for useful discussions and assistance with MPI implementations of our codes.  ... 
doi:10.1016/j.cpc.2003.12.003 fatcat:ozwoos2rcjfy3o3gjexbws3hry

Estimation of a semi-physical GLBE model using dual EnKF learning algorithm coupled with a sensor network design strategy: Application to air field monitoring

Gilles Roussel, Laurent Bourgois, Mohammed Benjelloun, Gilles Delmaire
2013 Information Fusion  
Finally, we proposed to complete the lack of spatial information of the sparse-observation network by adding a mobile sensor, which was routed to the location where the cell-by-cell output estimation error  ...  Experimental results in the context of the standard lid-driven cavity problem revealed the presence of few zones of interest, where fixed sensors can be deployed to increase performances in terms of convergence  ...  Lattice Boltzmann model General framework The LBM is a numerical method based on Boltzmann kinetic theory and can be expressed in terms of the probability to find a fluid particle, in the vicinity of  ... 
doi:10.1016/j.inffus.2013.03.001 fatcat:haykosajyjc5fj6oq7kjosjzcu

Memory-efficient Lattice Boltzmann Method for low Reynolds number flows [article]

Maciej Matyka
2019 arXiv   pre-print
The Lattice Boltzmann Method algorithm is simplified by assuming constant numerical viscosity (the relaxation time is fixed at τ=1).  ...  This leads to the removal of the distribution function from the computer memory. To test the solver the Poiseuille and Driven Cavity flows are simulated and analyzed.  ...  The Model The Lattice Boltzmann Method uses the multi-dimensional velocity distribution f_k(x, t) to describe the state of the fluid.  ... 
arXiv:1912.09327v1 fatcat:d5x62wd6gne4tc6dsqo5risksa

waLBerla: A block-structured high-performance framework for multiphysics simulations [article]

Martin Bauer, Sebastian Eibl, Christian Godenschwager, Nils Kohl, Michael Kuron, Christoph Rettinger, Florian Schornbaum, Christoph Schwarzmeier, Dominik Thönnes, Harald Köstler, Ulrich Rüde
2019 arXiv   pre-print
Multiple levels of parallelism on the core, on the compute node, and between nodes need to be exploited to make full use of the system.  ...  The framework uses meta-programming techniques to generate highly efficient code for CPUs and GPUs from a symbolic method formulation.  ...  Acknowledgements The authors thank Daniela Anderl, Dominik Bartuschat As waLBerla is designed for massively parallel high performance computing, access to supercomputing facilities is of essential importance  ... 
arXiv:1909.13772v1 fatcat:b2iwdbbugjeebk3diazleysedi

Modeling Patient-Specific Magnetic Drug Targeting Within the Intracranial Vasculature

Alexander Patronis, Robin A. Richardson, Sebastian Schmieschek, Brian J. N. Wylie, Rupert W. Nash, Peter V. Coveney
2018 Frontiers in Physiology  
FIGURE 11 | Breakdown of metrics and efficiencies for HemeLB Simulate phase (operating on a voxelized representation of the circle of Willis model previously described) on ARCHER Cray XC30 (24 ranks per  ...  We demonstrate the excellent computational performance of our model by its application to the simulation of paramagnetic-nanoparticle-laden flows in a circle of Willis geometry obtained from an MRI scan  ...  We acknowledge funding support from the EU H2020 ACKNOWLEDGMENTS We thank Alberto Figueroa for providing the circle of Willis geometry, and Ulf Schiller, Derek Groen and Miguel Bernabeu for constructive  ... 
doi:10.3389/fphys.2018.00331 pmid:29725303 pmcid:PMC5917293 fatcat:5q2353kmjrdutdw7oyxoiw2shu

Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning

Samuel Williams, Leonid Oliker, Jonathan Carter, John Shalf
2011 Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '11  
methods on forthcoming HPC systems.  ...  Results show that our unique tuning approach improves performance and energy requirements by up to 3.4× using 49,152 cores, while providing a portable optimization methodology for a variety of numerical  ...  LATTICE BOLTZMANN MODELS Variants of the Lattice Boltzmann equation have been applied to problems such as fluid flows, flows in porous media, and turbulent flows over about the past 25 years [25].  ... 
doi:10.1145/2063384.2063458 dblp:conf/sc/WilliamsOCS11 fatcat:e7snov63dvfklehytpcqbfb7xi

Efficient monolithic simulation techniques for the stationary Lattice Boltzmann equation on general meshes

T. Hübner, S. Turek
2009 Computing and Visualization in Science  
In this paper, we present special discretization and solution techniques for the numerical simulation of the Lattice Boltzmann equation (LBE).  ...  Finally, we present quantitative comparisons between a highly optimized CFD-solver based on the Navier-Stokes equation (FeatFlow) and our new LBE solver (FeatLBE).  ...  Introduction In this paper we consider the Lattice Boltzmann equation (LBE), sometimes also referred to as discrete velocity model (DVM) in contrast to the Lattice Boltzmann method (LBM) which is thoroughly  ... 
doi:10.1007/s00791-009-0132-6 fatcat:7bmini6s7rc5ddryrkcrmlzd6i