Block-Relaxation Methods for 3D Constant-Coefficient Stencils on GPUs and Multicore CPUs [article]

Manuel Birke, Bobby Philip, Zhen Wang, Mark Berrill
<span title="2019-07-15">2019</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We address this issue in the case of constant-coefficient stencils arising in the solution of elliptic partial differential equations on structured 3D uniform and adaptively refined grids.  ...  Developing robust and efficient algorithms suitable for current and evolving GPU and multicore CPU systems is a significant challenge.  ...  and Carl Ponder from NVIDIA Corporation for the useful discussions about CUDA.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1208.1975v3">arXiv:1208.1975v3</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/4irc6pc7xjhlridg5j7vslahrm">fatcat:4irc6pc7xjhlridg5j7vslahrm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200905173204/https://arxiv.org/pdf/1208.1975v2.pdf" title="fulltext PDF download [not primary version]" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <span style="color: #f43e3e;">&#10033;</span> <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/11/22/1122edf52aeacbba75a8104aab16636fb814136f.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1208.1975v3" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Multi-dimensional intra-tile parallelization for memory-starved stencil computations [article]

Tareq Malas, Georg Hager, Hatem Ltaief, David Keyes
<span title="2015-10-16">2015</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We propose a flexible multi-dimensional intra-tile parallelization method for stencil algorithms on multicore CPUs with a shared outer-level cache.  ...  Girih shows substantial performance advantages and best arithmetic intensity at almost all problem sizes, especially on low-intensity stencils with variable coefficients.  ...  ACKNOWLEDGMENTS For computer time, this research used the resources of the Extreme Computing Research Center (ECRC) at KAUST. The authors thank the ECRC for supporting T. Malas.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1510.04995v1">arXiv:1510.04995v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/twbfi3zicbe7bdu3hgn7d37h7q">fatcat:twbfi3zicbe7bdu3hgn7d37h7q</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200930090311/https://arxiv.org/pdf/1510.04995v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/94/0e/940e26b54efa5383de328534725c469b40fdf6bb.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1510.04995v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Extendable pattern-oriented optimization directives

Huimin Cui, Jingling Xue, Lei Wang, Yang Yang, Xiaobing Feng, Dongrui Fan
<span title="2012-09-01">2012</span> <i title="Association for Computing Machinery (ACM)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/jfrn2kjyarhe7npmgvoxdp4cxu" style="color: black;">ACM Transactions on Architecture and Code Optimization (TACO)</a> </i> &nbsp;
We have identified and implemented a number of optimization patterns for three representative computer platforms.  ...  Current programming models and compiler technologies for multi-core processors do not exploit well the performance benefits obtainable by applying algorithm-specific, i.e., semantic-specific optimizations  ...  Fig. 14: Scalability of relaxed-stencil on x86SMP. Fig. 16: stencil on Godson-T. Fig. 18: dense-mm on GPU.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/2355585.2355587">doi:10.1145/2355585.2355587</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/zkx6ykcm3nf2ld3kzkcqk2btrq">fatcat:zkx6ykcm3nf2ld3kzkcqk2btrq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20120106124615/http://www.cse.unsw.edu.au/~jingling/papers/cgo11-cui.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/40/4e/404ea7d58eb39c237910f421b0a69f904032fbb8.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/2355585.2355587"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> acm.org </button> </a>

Extendable pattern-oriented optimization directives

Huimin Cui, Jingling Xue, Lei Wang, Yang Yang, Xiaobing Feng, Dongrui Fan
<span title="">2011</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/bmxgpqt325gxrkrxonhmytmhiu" style="color: black;">International Symposium on Code Generation and Optimization (CGO 2011)</a> </i> &nbsp;
We have identified and implemented a number of optimization patterns for three representative computer platforms.  ...  Current programming models and compiler technologies for multi-core processors do not exploit well the performance benefits obtainable by applying algorithm-specific, i.e., semantic-specific optimizations  ...  Fig. 14: Scalability of relaxed-stencil on x86SMP. Fig. 16: stencil on Godson-T. Fig. 18: dense-mm on GPU.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/cgo.2011.5764679">doi:10.1109/cgo.2011.5764679</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/cgo/CuiXWYFF11.html">dblp:conf/cgo/CuiXWYFF11</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/cv2p4qu6xrhvldik65wfahl53q">fatcat:cv2p4qu6xrhvldik65wfahl53q</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20120106124615/http://www.cse.unsw.edu.au/~jingling/papers/cgo11-cui.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/40/4e/404ea7d58eb39c237910f421b0a69f904032fbb8.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/cgo.2011.5764679"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

Parallel computations on GPU in 3D using the vortex particle method

Andrzej Kosior, Henryk Kudela
<span title="">2013</span> <i title="Elsevier BV"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wp3g5bo4ojczdinvl7cavuwkq4" style="color: black;">Computers &amp; Fluids</a> </i> &nbsp;
The paper presented the Vortex in Cell (VIC) method for solving the fluid motion equations in 3D and its implementation for parallel computation in multicore architecture of the Graphics Processing Unit  ...  One of the most important components of the VIC method algorithm is the solution of the Poisson equation. Multigrid and full multigrid methods were chosen for its solution on GPU.  ...  Fig. 7: Velocity of the vortex ring as a function of circulation for calculation on CPU and GPU.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1016/j.compfluid.2012.01.014">doi:10.1016/j.compfluid.2012.01.014</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xkug4xw2bzbc3ofrefri7gm7de">fatcat:xkug4xw2bzbc3ofrefri7gm7de</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170809103617/http://fluid.itcmp.pwr.wroc.pl/~znmp//Publikacje/CF.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/ab/01/ab016a8dad632d07796575f57e4111b16c5f97ad.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1016/j.compfluid.2012.01.014"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> elsevier.com </button> </a>

Parallel solutions of static Hamilton-Jacobi equations for simulations of geological folds

Tor Gillberg, Are Bruaset, Øyvind Hjelle, Mohammed Sourouri
<span title="">2014</span> <i title="Springer Nature"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wsm2dbrhwbfljbnu5qwtgqfsyq" style="color: black;">Journal of Mathematics in Industry</a> </i> &nbsp;
These algorithms are designed to work efficiently on different parallel computing architectures, and numerical results for multicore CPU and GPU implementations are reported and discussed.  ...  The performance of the new methods is investigated for two types of static Hamilton-Jacobi formulations, the isotropic eikonal equation and an anisotropic formulation used to simulate  ...  For instance, the LAS speedup of 6.0 for Ex A running eight cores gives an efficiency of 6.0/8 ≈ 75%. The CES software is only available on the Ivy platform running in double precision.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1186/2190-5983-4-10">doi:10.1186/2190-5983-4-10</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/56cj5kjg5zbdjityglurrudpk4">fatcat:56cj5kjg5zbdjityglurrudpk4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170808114620/https://mathematicsinindustry.springeropen.com/track/pdf/10.1186/2190-5983-4-10?site=mathematicsinindustry.springeropen.com" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/31/f8/31f81eec1fb987c5b848042c1a38ae7407fa9df6.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1186/2190-5983-4-10"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> springer.com </button> </a>

Adaptive kinetic-fluid solvers for heterogeneous computing architectures

Sergey Zabelok, Robert Arslanbekov, Vladimir Kolobov
<span title="">2015</span> <i title="Elsevier BV"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/f7kkigxhzjakref5b42jt4uoya" style="color: black;">Journal of Computational Physics</a> </i> &nbsp;
Double digit speedups on single GPU and good scaling for multi-GPUs have been demonstrated.  ...  Challenges are due to the irregular data access for adaptive Cartesian mesh, vast difference of computational cost between kinetic and fluid cells, and desire to evenly load all CPUs and GPUs during grid  ...  We wish to thank Dr Martin Burtscher for useful discussions and suggestion of the warp algorithm for the LBM solver. Thanks to an anonymous reviewer for useful suggestions for improving the paper.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1016/j.jcp.2015.10.003">doi:10.1016/j.jcp.2015.10.003</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/cce7msllzbfhbp5jttoekob7su">fatcat:cce7msllzbfhbp5jttoekob7su</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20180724221606/https://manuscript.elsevier.com/S0021999115006658/pdf/S0021999115006658.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/18/67/18676b1940d24edc608f2a7acbd814bd0800c3be.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1016/j.jcp.2015.10.003"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> elsevier.com </button> </a>

Accelerating leukocyte tracking using CUDA: A case study in leveraging manycore coprocessors

M. Boyer, D. Tarjan, S.T. Acton, K. Skadron
<span title="">2009</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/t3x4vqewrncrfgn2wu7cafsbsq" style="color: black;">2009 IEEE International Symposium on Parallel &amp; Distributed Processing</a> </i> &nbsp;
The availability of easily programmable manycore CPUs and GPUs has motivated investigations into how to best exploit their tremendous computational power for scientific computing.  ...  Here we demonstrate how a systems biology application (detection and tracking of white blood cells in video microscopy) can be accelerated by 200x using a CUDA-capable GPU.  ...  The authors would like to thank Saurav Basu for his help with the original MATLAB implementation, as well as Leo Wolpert, Donald Carter, and Drew Gilliam for their prior work on implementing the leukocyte  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ipdps.2009.5160984">doi:10.1109/ipdps.2009.5160984</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/ipps/BoyerTAS09.html">dblp:conf/ipps/BoyerTAS09</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ipxsq5igtveo7m47nffsg52k7y">fatcat:ipxsq5igtveo7m47nffsg52k7y</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20130728011535/http://www.cs.virginia.edu/~skadron/Papers/boyer_leukocyte_ipdps09.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/d1/20/d1202ca5b02a3d862dbef9ef3778a720300c0eae.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ipdps.2009.5160984"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs

Eike Müller, Xu Guo, Robert Scheichl, Sinan Shi
<span title="">2013</span> <i title="Springer Nature"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/jz3q3abd2nexhbsywjnhqvqqia" style="color: black;">Computing and Visualization in Science</a> </i> &nbsp;
In this article we describe the GPU implementation and optimisation of a Preconditioned Conjugate Gradient (PCG) algorithm for the solution of a three dimensional anisotropic elliptic PDE for the pressure  ...  Graphics Processing Units (GPUs) have been shown to be highly efficient (both in terms of absolute performance and power consumption) for a wide range of applications in scientific computing, and recently  ...  The numerical experiments in this work were carried out on a node of the aquila supercomputer at the University of Bath and we are grateful to Steven Chapman for his continuous and tireless technical support  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s00791-014-0223-x">doi:10.1007/s00791-014-0223-x</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/66fe4fa4kfbezo462oe5yen3rm">fatcat:66fe4fa4kfbezo462oe5yen3rm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20171129173029/https://core.ac.uk/download/pdf/38144712.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/24/09/2409de10d42d0b39ebde057e3b96032605528a96.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s00791-014-0223-x"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> springer.com </button> </a>

Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs [article]

Eike Mueller, Xu Guo, Robert Scheichl, Sinan Shi
<span title="2013-02-28">2013</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We demonstrate the performance of our matrix-free GPU code by comparing it to a sequential CPU implementation and to a matrix-explicit GPU code which uses existing libraries.  ...  Global memory access can also be reduced by rewriting the algorithm using loop fusion and we show that this further reduces the runtime on the GPU.  ...  The numerical experiments in this work were carried out on a node of the aquila supercomputer at the University of Bath and we are grateful to Steven Chapman for his continuous and tireless technical support  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1302.7193v1">arXiv:1302.7193v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/migpyc2kbrdy7bdfx7tebfhdpe">fatcat:migpyc2kbrdy7bdfx7tebfhdpe</a> </span>
<a target="_blank" rel="noopener" href="https://archive.org/download/arxiv-1302.7193/1302.7193.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> File Archive [PDF] </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1302.7193v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Time-domain seismic modeling in viscoelastic media for full waveform inversion on heterogeneous computing platforms with OpenCL

Gabriel Fabien-Ouellet, Erwan Gloaguen, Bernard Giroux
<span title="">2017</span> <i title="Elsevier BV"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/zvr2ua2y3zgazjus2tljtegoui" style="color: black;">Computers &amp; Geosciences</a> </i> &nbsp;
To demonstrate the code portability on different architectures, the performance of SeisCL is tested on three different devices: Intel CPUs, NVidia GPUs and Intel Xeon PHI.  ...  We present a program called SeisCL for 2D and 3D viscoelastic FWI in the time domain.  ...  The higher cost for the GPU in 3D for orders 10 and 12 is caused by the limited amount of local memory.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1016/j.cageo.2016.12.004">doi:10.1016/j.cageo.2016.12.004</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/2j57qpasmvac5eiodlsa6tdp3u">fatcat:2j57qpasmvac5eiodlsa6tdp3u</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20190429024338/http://espace.inrs.ca/6342/1/P003082.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/1d/5f/1d5ff82dabd3d85e782260f71e029bee738aa28d.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1016/j.cageo.2016.12.004"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> elsevier.com </button> </a>

ExaStencils: Advanced Multigrid Solver Generation [chapter]

Christian Lengauer, Sven Apel, Matthias Bolten, Shigeru Chiba, Ulrich Rüde, Jürgen Teich, Armin Größlinger, Frank Hannig, Harald Köstler, Lisa Claus, Alexander Grebhahn, Stefan Groth (+5 others)
<span title="">2020</span> <i title="Springer International Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/4rhte7roibctnbr7jz7mi4vyw4" style="color: black;">Lecture Notes in Computational Science and Engineering</a> </i> &nbsp;
Present-day stencil codes are implemented in general-purpose programming languages, such as Fortran, C, Java, Python or derivatives thereof, and harnesses for parallelism, such as OpenMP, OpenCL or MPI  ...  At every layer, the corresponding language expresses not only computational directives but also domain knowledge of the problem and platform to be leveraged for optimization.  ...  We thank the Jülich Supercomputing Center (JSC) for providing access to the supercomputers JUQUEEN and JURECA and the Swiss National Supercomputing Centre (CSCS) for providing computational resources and  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-030-47956-5_14">doi:10.1007/978-3-030-47956-5_14</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xhbxnt45ynhilgh6vv2ui2n76i">fatcat:xhbxnt45ynhilgh6vv2ui2n76i</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200815064930/https://link.springer.com/content/pdf/10.1007%2F978-3-030-47956-5_14.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/09/21/0921a1dc56ea5cbbbba582c29e7db2ef9c90d2ef.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-030-47956-5_14"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> springer.com </button> </a>

High throughput software for direct numerical simulations of compressible two-phase flows

Babak Hejazialhosseini, Diego Rossinelli, Christian Conti, Petros Koumoutsakos
<span title="">2012</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/zigbcra6rjdivda6lkzknwuo5q" style="color: black;">2012 International Conference for High Performance Computing, Networking, Storage and Analysis</a> </i> &nbsp;
The Navier-Stokes equations are discretized on uniform grids using high order finite volume methods.  ...  The software exploits recent CPU micro-architectures by explicit vectorization and adopts NUMA-aware techniques as well as data and computation reordering.  ...  Olivier Byrde, Adrian Ulrich, Teodoro Brasacchio and Eric Müller of Brutus cluster at ETH Zurich for their exceptional and crucial assistance.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/sc.2012.66">doi:10.1109/sc.2012.66</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/sc/HejazialhosseiniRCK12.html">dblp:conf/sc/HejazialhosseiniRCK12</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/gazke6q2gjbatpksmsplfspkt4">fatcat:gazke6q2gjbatpksmsplfspkt4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20140803012753/http://conferences.computer.org/sc/2012/papers/1000a039.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/4f/81/4f81dfc71b2ba61561bff78a5844b5eabaf563a1.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/sc.2012.66"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

Modeling the conflicting demands of parallelism and Temporal/Spatial locality in affine scheduling

Oleksandr Zinenko, Sven Verdoolaege, Chandan Reddy, Jun Shirako, Tobias Grosser, Vivek Sarkar, Albert Cohen
<span title="">2018</span> <i title="ACM Press"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/co3tf2zzendhlbd6zt7kuvv3ha" style="color: black;">Proceedings of the 27th International Conference on Compiler Construction - CC 2018</a> </i> &nbsp;
While the overall problem is not convex, effective algorithms can be derived from this template delivering unprecedented performance portability over GPU and multicore CPU.  ...  We discuss the rationale for this algorithmic template and validate it on representative computational kernels/benchmarks.  ...  We discussed the rationale for this unified algorithm, as well as its validation on representative benchmarks.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/3178372.3179507">doi:10.1145/3178372.3179507</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/cc/ZinenkoVRSGS018.html">dblp:conf/cc/ZinenkoVRSGS018</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/wsluiyrt3rds3ctz3seukznsfi">fatcat:wsluiyrt3rds3ctz3seukznsfi</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20190501193318/https://hal.inria.fr/hal-01751823/document" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/fe/06/fe06ba58b9b9d609d883fc1392ac17a4cae2e19c.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/3178372.3179507"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> acm.org </button> </a>

FEAST-realization of hardware-oriented numerics for HPC simulations with finite elements

Stefan Turek, Dominik Göddeke, Christian Becker, Sven H. M. Buijssen, Hilmar Wobker
<span title="2010-05-12">2010</span> <i title="Wiley"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/o454xj4tdjfllm3oragfkdrffi" style="color: black;">Concurrency and Computation</a> </i> &nbsp;
NEC SX 8 and GPU-accelerated clusters.  ...  We demonstrate good performance and weak and strong scalability for the prototypical Poisson problem and more challenging applications from solid mechanics and fluid dynamics.  ...  Thanks to NVIDIA for donating hardware that was used in the development of Feast's GPU backend.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1002/cpe.1584">doi:10.1002/cpe.1584</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/6omy4woanrbn7dqigs3q4ql5hq">fatcat:6omy4woanrbn7dqigs3q4ql5hq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200321031456/http://www.mathematik.tu-dortmund.de/lsiii/cms/papers/TurekGoeddekeBeckerBuijssenWobker2010.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/f2/fb/f2fb4befddf0847d68932b0d24b2413efd65b373.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1002/cpe.1584"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> wiley.com </button> </a>
Showing results 1 &mdash; 15 out of 69 results