Introducing a Performance Model for Bandwidth-Limited Loop Kernels [article]

Jan Treibig, Georg Hager
<span title="2009-05-06">2009</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We present a performance model for bandwidth-limited loop kernels which is founded on the analysis of modern cache-based microarchitectures.  ...  This model allows an accurate performance prediction and evaluation for existing instruction codes.  ...  Conclusion The proposed model introduces a systematic approach to understand the performance of bandwidth-limited loop kernels, especially for in-cache situations.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/0905.0792v1">arXiv:0905.0792v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/euqzc224ibbftnfgrsomg7oq7i">fatcat:euqzc224ibbftnfgrsomg7oq7i</a> </span>

Lattice Boltzmann benchmark kernels as a testbed for performance analysis

M. Wittmann, V. Haag, T. Zeiser, H. Köstler, G. Wellein
<span title="">2018</span> <i title="Elsevier BV"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wp3g5bo4ojczdinvl7cavuwkq4" style="color: black;">Computers &amp; Fluids</a> </i> &nbsp;
In this paper we give an overview of already available kernels, establish a performance model for each kernel, and show a comparison of implementations and recent architectures.  ...  The kernels may act as a reference for performance comparisons and as a blueprint for optimization strategies.  ...  Acknowledgments We would like to thank Christoph Rettinger (Chair for System Simulation), for helpful discussions regarding verification.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1016/j.compfluid.2018.03.030">doi:10.1016/j.compfluid.2018.03.030</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/n7e7b62ihjfdpob5kdcnnz3g2q">fatcat:n7e7b62ihjfdpob5kdcnnz3g2q</a> </span>

Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX [article]

Christie L. Alappat, Jan Laukemann, Thomas Gruber, Georg Hager, Gerhard Wellein, Nils Meyer, Tilo Wettig
<span title="2020-09-29">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Using these features, we construct the Execution-Cache-Memory (ECM) performance model for the A64FX processor in the FX700 supercomputer and validate it using streaming loops.  ...  Generating efficient code for such a new architecture requires a good understanding of its performance features.  ...  that the read-only memory bandwidth was used as a limit for SUM.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2009.13903v1">arXiv:2009.13903v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/f5iikrcor5aurlwjpe4d74e3xy">fatcat:f5iikrcor5aurlwjpe4d74e3xy</a> </span>

Multi-core architectures: Complexities of performance prediction and the impact of cache topology [article]

Jan Treibig, Georg Hager, Gerhard Wellein
<span title="2009-10-26">2009</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
The balance metric is a simple approach to estimate the performance of bandwidth-limited loop kernels.  ...  This paper analyzes the influence of cache hierarchy design on performance predictions for bandwidth-limited loop kernels on current mainstream processors.  ...  Acknowledgments We thank Darren Kerbyson (LANL), Herbert Cornelius (Intel Germany), Michael Meier (RRZE), and Matthias Müller (ZIH) for fruitful discussions.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/0910.4865v1">arXiv:0910.4865v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/6qtokh4uuzauzcmyagmqsruzne">fatcat:6qtokh4uuzauzcmyagmqsruzne</a> </span>

Analysis of Intel's Haswell Microarchitecture Using The ECM Model and Microbenchmarks [article]

Johannes Hofmann, Dietmar Fey, Jan Eitzinger, Georg Hager, Gerhard Wellein
<span title="2015-11-13">2015</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
The set of microbenchmarks is chosen such that it can serve as a blueprint for other streaming loop kernels.  ...  This paper presents an in-depth analysis of Intel's Haswell microarchitecture for streaming loop kernels.  ...  As a consequence the model yields a practical upper limit for single core performance.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1511.03639v2">arXiv:1511.03639v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/t7hbfcikkzeoxc7kmkinl3argm">fatcat:t7hbfcikkzeoxc7kmkinl3argm</a> </span>

Evaluating attainable memory bandwidth of parallel programming models via BabelStream

Matt Martineau, Simon McIntosh Smith, James Price, Tom Deakin
<span title="">2017</span> <i title="Inderscience Publishers"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/mvlpi4a7izattgygry5kh4ruaq" style="color: black;">International Journal of Computational Science and Engineering (IJCSE)</a> </i> &nbsp;
The choice of one programming model over another should ideally not limit the performance that can be achieved on a device.  ...  We augment the standard set of STREAM kernels with a dot product kernel to investigate the performance of simple reduction operations on large arrays.  ...  Our thanks to Codeplay for early access to the ComputeCpp SYCL compiler and to Douglas Miles at PGI (NVIDIA) for access to the PGI compiler.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1504/ijcse.2017.10011352">doi:10.1504/ijcse.2017.10011352</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/rn76twry4fd7jcfwlqlixv7ulu">fatcat:rn76twry4fd7jcfwlqlixv7ulu</a> </span>

Achieving high memory performance from heterogeneous architectures with the SARC programming model

Roger Ferrer, Vicenç Beltran, Marc González, Xavier Martorell, Eduard Ayguadé
<span title="">2009</span> <i title="ACM Press"> Proceedings of the 10th MEDEA workshop on MEmory performance DEaling with Applications, systems and architecture - MEDEA &#39;09 </i> &nbsp;
Results indicate that the programming model is able to achieve up to 85% of the peak memory bandwidth on the Cell/B.E. processor.  ...  In this paper, we want to present the results we obtain from the coding with the SARC Programming Model, of two benchmarks, matrix multiply and conjugate gradient (NAS CG), with respect to memory bandwidth  ...  for the discussions on the parallelization of NAS CG with the SARC Programming Model.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/1621960.1621963">doi:10.1145/1621960.1621963</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/z2v6z6kgj5gmzckc72fci6uwge">fatcat:z2v6z6kgj5gmzckc72fci6uwge</a> </span>

Bridging the Architecture Gap: Abstracting Performance-Relevant Properties of Modern Server Processors

<span title="">2020</span> <i title="FSAEIHE South Ural State University (National Research University)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/ut52ymkqynfbhjyzwhwl44xhvq" style="color: black;">Supercomputing Frontiers and Innovations</a> </i> &nbsp;
We introduce a generic method for determining machine models, and present results for relevant server-processor architectures by Intel, AMD, IBM, and Marvell/Cavium.  ...  We propose several improvements to the execution-cache-memory (ECM) model, an analytic performance model for predicting single- and multicore runtime of steady-state loops on server processors.  ...  We also thank the Center for Information Services and High Performance Computing (ZIH) at TU Dresden for providing access to their Power9 cluster.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.14529/jsfi200204">doi:10.14529/jsfi200204</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ed2qa525pnghffqnuzz2xjvayi">fatcat:ed2qa525pnghffqnuzz2xjvayi</a> </span>

Performance Modeling of the HPCG Benchmark [chapter]

Vladimir Marjanović, José Gracia, Colin W. Glass
<span title="">2015</span> <i title="Springer International Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2w3awgokqne6te4nvlofavy5a4" style="color: black;">Lecture Notes in Computer Science</a> </i> &nbsp;
Furthermore, we present a model, capable of predicting the performance of HPCG on a given architecture, based solely on two inputs: the effective bandwidth between the main memory and the CPU and the highest  ...  The TOP 500 list is the most widely regarded ranking of modern supercomputers, based on Gflop/s measured for High Performance LINPACK (HPL).  ...  Acknowledgements The authors would like to thank Mandes Schönherr for valuable contributions. This research is partly supported by EU project POLCA (FP7-ICT-2013-10, grant agreement no. 610686).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-319-17248-4_9">doi:10.1007/978-3-319-17248-4_9</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ht44bpkdibazpd2lwtcgkfa6ci">fatcat:ht44bpkdibazpd2lwtcgkfa6ci</a> </span>

Pragmatic Performance Portability with OpenMP 4.x [chapter]

Matt Martineau, James Price, Simon McIntosh-Smith, Wayne Gaudin
<span title="">2016</span> <i title="Springer International Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2w3awgokqne6te4nvlofavy5a4" style="color: black;">Lecture Notes in Computer Science</a> </i> &nbsp;
We outline the mechanisms that they use to map the OpenMP model onto their target architectures, and conduct performance testing with a number of representative data parallel kernels.  ...  Following this we present a discussion about the current state of play in terms of performance portability and propose some straightforward guidelines for writing performance portable code, derived from  ...  Clang has poor performance for the column indirect kernel, and this is because the inner loop cannot be collapsed into the iteration space, which limits the available work to the length of the outer loop  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-319-45550-1_18">doi:10.1007/978-3-319-45550-1_18</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/cizdknklfzarbknnayx6fh655m">fatcat:cizdknklfzarbknnayx6fh655m</a> </span>

Automatic loop kernel analysis and performance modeling with Kerncraft

Julian Hammer, Georg Hager, Jan Eitzinger, Gerhard Wellein
<span title="">2015</span> <i title="ACM Press"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/zigbcra6rjdivda6lkzknwuo5q" style="color: black;">Proceedings of the 6th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computing Systems - PMBS &#39;15</a> </i> &nbsp;
Analytic performance models are essential for understanding the performance characteristics of loop kernels, which consume a major part of CPU cycles in computational science.  ...  We present the "Kerncraft" tool, which eases the construction of analytic performance models for streaming kernels and stencil loop nests.  ...  to yield performance predictions using the ECM and Roofline models for steady-state loop kernels.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/2832087.2832092">doi:10.1145/2832087.2832092</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/sc/HammerHEW15.html">dblp:conf/sc/HammerHEW15</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/vsyvvnthjne27di4tzt4qkcb6m">fatcat:vsyvvnthjne27di4tzt4qkcb6m</a> </span>

Optimization and performance evaluation of the IDR iterative Krylov solver on GPUs

Hartwig Anzt, Moritz Kreutzer, Eduardo Ponce, Gregory D Peterson, Gerhard Wellein, Jack Dongarra
<span title="2016-05-05">2016</span> <i title="SAGE Publications"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/prmoui3bz5a2jktzh4pt7dtery" style="color: black;">The international journal of high performance computing applications</a> </i> &nbsp;
A comprehensive performance evaluation is conducted using a suitable performance model.  ...  kernel execution.  ...  Larger problems provide more parallelism, which brings the achieved bandwidth closer to the maximum bandwidth the roofline performance model is based on (see Section 6).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1177/1094342016646844">doi:10.1177/1094342016646844</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/oyteuyn6qfdw7cf5ya25uag5jq">fatcat:oyteuyn6qfdw7cf5ya25uag5jq</a> </span>

K-Athena: a performance portable structured grid finite volume magnetohydrodynamics code [article]

Philipp Grete, Forrest W. Glines, Brian W. O'Shea
<span title="2020-07-15">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Using a roofline analysis we demonstrate that the overall performance is currently limited by DRAM bandwidth and calculate a performance portability metric of 62.8 used and the challenges encountered in  ...  Performance portability is required to prevent repeated non-trivial refactoring of a code for different architectures.  ...  ACKNOWLEDGMENTS The authors would like to thank the KOKKOS developers, particularly Christian Trott and Steve Bova, and the organizers of the 2018 Performance Portability with KOKKOS Bootcamp for their  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1905.04341v2">arXiv:1905.04341v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/rrlmbihwwje4pbsffhhwr2ofym">fatcat:rrlmbihwwje4pbsffhhwr2ofym</a> </span>

Dynamic Intelligent Feedback Scheduling in Networked Control Systems

Hui-ying Chen, Zu-xin Li, Pei-liang Wang
<span title="">2013</span> <i title="Hindawi Limited"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wpareqynwbgqdfodcyhh36aqaq" style="color: black;">Mathematical Problems in Engineering</a> </i> &nbsp;
For the networked control system with limited bandwidth and flexible workload, a dynamic intelligent feedback scheduling strategy is proposed.  ...  At the same time, the dynamic performance indices of all control loops are obtained with a two-dimensional fuzzy logic modulator.  ...  Acknowledgments The authors thank the reviewers for their very helpful comments and suggestions, which have improved the presentation of this paper. This study is supported by the National Nat-  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1155/2013/584393">doi:10.1155/2013/584393</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/c77lkq37brhm7d3xcbihcmlzaa">fatcat:c77lkq37brhm7d3xcbihcmlzaa</a> </span>

GPU-STREAM v2.0: Benchmarking the Achievable Memory Bandwidth of Many-Core Processors Across Diverse Parallel Programming Models [chapter]

Tom Deakin, James Price, Matt Martineau, Simon McIntosh-Smith
<span title="">2016</span> <i title="Springer International Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2w3awgokqne6te4nvlofavy5a4" style="color: black;">Lecture Notes in Computer Science</a> </i> &nbsp;
Our thanks to Codeplay for access to the ComputeCpp SYCL compiler and to Douglas Miles at PGI (NVIDIA) for access to the PGI compiler.  ...  Thanks also go to the University of Oxford for access to the Power 8 system.  ...  Performance We display the fraction of peak memory bandwidth we were able to achieve for a variety of devices against each programming model in Fig. 10.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-319-46079-6_34">doi:10.1007/978-3-319-46079-6_34</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/6l3mt6xn65d67fwmemxiaqf5um">fatcat:6l3mt6xn65d67fwmemxiaqf5um</a> </span>