
Cross-architecture Kalman filter benchmarks on modern hardware platforms

D H Cámpora Pérez, O Awile, O Bouizi, N Neufeld
<span title="">2018</span> <i title="IOP Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wxgp7pobnrfetfizidmpebi4qy" style="color: black;">Journal of Physics, Conference Series</a> </i> &nbsp;
In this paper we present performance benchmarks and explore the Intel® Skylake and Intel® Knights Landing architectures in depth.  ...  The Kalman filter is a process of the event reconstruction that, due to its time characteristics and early execution in the selection chain, consumes 40% of the whole reconstruction time in the current  ...  Potterat for his contributions on the validation of the software. Thanks to F. Lemaitre for his contribution of the vectorized transposition code. In addition, thanks to W. Hulsbergen and R.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1088/1742-6596/1085/3/032046">doi:10.1088/1742-6596/1085/3/032046</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/tko2re3sajganczv3q3dfos5fq">fatcat:tko2re3sajganczv3q3dfos5fq</a> </span>

A memory heterogeneity-aware runtime system for bandwidth-sensitive HPC applications

Kavitha Chandrasekar, Xiang Ni, Laxmikant V. Kale
<span title="">2017</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/t3x4vqewrncrfgn2wu7cafsbsq" style="color: black;">2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)</a> </i> &nbsp;
Today's supercomputers are moving towards deployment of many-core processors like Intel Xeon Phi Knights Landing (KNL), to deliver high compute and memory capacity.  ...  To improve performance, architectures like Knights Landing include a high bandwidth and low capacity in-package high bandwidth memory (HBM) in addition to the high capacity but low bandwidth DDR4.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ipdpsw.2017.168">doi:10.1109/ipdpsw.2017.168</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/ipps/ChandrasekarNK17.html">dblp:conf/ipps/ChandrasekarNK17</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/tm2vcyo7qbe2rdtzimhfv2o47q">fatcat:tm2vcyo7qbe2rdtzimhfv2o47q</a> </span>

Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor [chapter]

Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq Malas, Jean-Luc Vay, Henri Vincenti
<span title="">2016</span> <i title="Springer International Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2w3awgokqne6te4nvlofavy5a4" style="color: black;">Lecture Notes in Computer Science</a> </i> &nbsp;
Regardless, the penalty on the performance bounds can be severe, up to 80× on an Intel Knights Landing processor.  ...  Conclusion and Outlook: In this study we have developed a Roofline Model for the Intel Knights Landing processor and have estimated upper bounds for L1, L2, MCDRAM and DDR4.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-319-46079-6_24">doi:10.1007/978-3-319-46079-6_24</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/4lfmybdu5bdlfotf7ej3n56iq4">fatcat:4lfmybdu5bdlfotf7ej3n56iq4</a> </span>

A Performance Model to Execute Workflows on High-Bandwidth-Memory Architectures

Anne Benoit, Swann Perarnau, Loïc Pottier, Yves Robert
<span title="">2018</span> <i title="ACM Press"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/3qrmigawsbhbljazwy2rerl5ni" style="color: black;">Proceedings of the 47th International Conference on Parallel Processing - ICPP 2018</a> </i> &nbsp;
This work presents a realistic performance model to execute scientific workflows on high-bandwidth memory architectures such as the Intel Knights Landing.  ...  Extensive simulations allow us to assess the impact of the mapping strategies on performance.  ...  One of the first widely available systems to exhibit this kind of new memory is Intel's Knights Landing [2, 13, 24].  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/3225058.3225110">doi:10.1145/3225058.3225110</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/icpp/BenoitPPR18.html">dblp:conf/icpp/BenoitPPR18</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/a2d6eabvpvhhngjyyy75fya7pu">fatcat:a2d6eabvpvhhngjyyy75fya7pu</a> </span>

EVALUATION OF OPENMP OPTIMIZATION IN HETEROGENEOUS COMPUTING MODE BY CODE OFFLOADING ON INTEL XEON PHI CO-PROCESSOR

Kajal Chauhan
<span title="2018-02-20">2018</span> <i title="IJARCS International Journal of Advanced Research in Computer Science"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/dbwzqwxyw5hn3aantchmoxx65q" style="color: black;">International Journal of Advanced Research in Computer Science</a> </i> &nbsp;
The present authors had earlier tested on the Intel Quad Core i7 processor with 8 threads and two Intel Xeon 12 core with 48 threads CPUs for optimization of K-Means clustering image processing code using  ...  The speedup of 5x was achieved on Intel i7 core CPU and 13x was obtained on Intel Xeon CPU when dynamic scheduling as threads deployed were large.  ...  Such a situation is avoided if a Knights Landing Intel Xeon Phi processor is deployed in place of the host processor(s).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.26483/ijarcs.v9i2.5746">doi:10.26483/ijarcs.v9i2.5746</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/wd54usnvjrabtpclffbp3uqism">fatcat:wd54usnvjrabtpclffbp3uqism</a> </span>

Co-scheduling on Upcoming Many-Core Architectures

Simon Pickartz, Jens Breitbart, Stefan Lankes, Josef Weidendorfer, Carsten Clauss, Carsten Trinitis
<span title="2017-01-18">2017</span> <i > <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2vqdbwpljnbcfm7funvunf2kam" style="color: black;">International Conference on High Performance Embedded Architectures and Compilers</a> </i> &nbsp;
Co-scheduling is known to optimize the utilization of supercomputers.  ...  This is especially true for traditional multi-core architecture where a subset of the available cores are already able to saturate the main memory bandwidth.  ...  Furthermore, we want to thank MEGWARE who provided us with a Clustsafe to perform the energy measurements.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.14459/2017md1344415">doi:10.14459/2017md1344415</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/hipeac/PickartzBL17.html">dblp:conf/hipeac/PickartzBL17</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/jdqmyjxdkzeexf5gyyykzcecwq">fatcat:jdqmyjxdkzeexf5gyyykzcecwq</a> </span>

Early Experience on Using Knights Landing Processors for Lattice Boltzmann Applications [chapter]

Enrico Calore, Alessandro Gabbana, Sebastiano Fabio Schifano, Raffaele Tripiccione
<span title="">2018</span> <i title="Springer International Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2w3awgokqne6te4nvlofavy5a4" style="color: black;">Lecture Notes in Computer Science</a> </i> &nbsp;
The Knights Landing (KNL) is the codename for the latest generation of Intel processors based on Intel Many Integrated Core (MIC) architecture.  ...  We assess the performance of this processor for Lattice Boltzmann codes, widely used in computational fluid-dynamics.  ...  This work was done in the framework of the COKA, COSA and SUMA projects of INFN. We would like to thank CINECA (Italy) for access to their HPC systems.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-319-78024-5_45">doi:10.1007/978-3-319-78024-5_45</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/mih3anqr7zgxnhariki7iw4bjm">fatcat:mih3anqr7zgxnhariki7iw4bjm</a> </span>

Sparse Tensor Factorization on Many-Core Processors with High-Bandwidth Memory

Shaden Smith, Jongsoo Park, George Karypis
<span title="">2017</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/t3x4vqewrncrfgn2wu7cafsbsq" style="color: black;">2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)</a> </i> &nbsp;
These features are exemplified in Intel's recent Knights Landing many-core processor (KNL), which typically has 68 cores and 16GB of on-package multi-channel DRAM (MCDRAM).  ...  To address these challenging demands, HPC systems are turning to many-core architectures that feature a large number of energy-efficient cores backed by high-bandwidth memory.  ...  CONCLUSIONS AND FUTURE WORK We presented the first exploration of sparse tensor factorization on a many-core processor, using the new Xeon Phi Knights Landing processor as a case study.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ipdps.2017.84">doi:10.1109/ipdps.2017.84</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/ipps/SmithPK17.html">dblp:conf/ipps/SmithPK17</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xykssjzn3vhzdjpe6uz7dcttsm">fatcat:xykssjzn3vhzdjpe6uz7dcttsm</a> </span>

A Locality-Based Threading Algorithm for the Configuration-Interaction Method

Hongzhang Shan, Samuel Williams, Calvin Johnson, Kenneth McElvain
<span title="">2017</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/t3x4vqewrncrfgn2wu7cafsbsq" style="color: black;">2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)</a> </i> &nbsp;
The new algorithm scales to 256 threads on the 64-core Intel Knights Landing (KNL) manycore processor and 24 threads on dual-socket Ivy Bridge (Xeon) nodes.  ...  Compared with the original implementation, the performance has been improved by up to 7× on the Knights Landing processor and 3× on the dual-socket Ivy Bridge node.  ...  Knights Landing: The Knights Landing (KNL) node contains a single, self-hosted Intel Xeon Phi processor with 64 out-of-order superscalar (but to a lesser degree than Ivy Bridge) cores running at speed of  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ipdpsw.2017.15">doi:10.1109/ipdpsw.2017.15</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/ipps/ShanWJM17.html">dblp:conf/ipps/ShanWJM17</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/fs6lz4xcbbegbjwnj7iv2fyqpy">fatcat:fs6lz4xcbbegbjwnj7iv2fyqpy</a> </span>

Multi-threaded ATLAS simulation on Intel Knights Landing processors

Steven Farrell, Paolo Calafiura, Charles Leggett, Vakhtang Tsulaia, Andrea Dotti
<span title="">2017</span> <i title="IOP Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wxgp7pobnrfetfizidmpebi4qy" style="color: black;">Journal of Physics, Conference Series</a> </i> &nbsp;
The Knights Landing (KNL) release of the Intel Many Integrated Core (MIC) Xeon Phi line of processors is a potential game changer for HEP computing.  ...  Cori Phase 2 is based on the KNL architecture and contains over 9000 compute nodes with 96GB DDR4 memory.  ...  Xeon Phi chips run a Linux OS, making them substantially easier to use than FPGAs and GPUs. The current (2nd) generation of Xeon Phi processors is codenamed Knights Landing (KNL).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1088/1742-6596/898/4/042012">doi:10.1088/1742-6596/898/4/042012</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/fkut7fmzv5fqzjymrhpmpe6dpa">fatcat:fkut7fmzv5fqzjymrhpmpe6dpa</a> </span>

Coherence Traffic in Manycore Processors with Opaque Distributed Directories [article]

Steve Kommrusch, Marcos Horro, Louis-Noël Pouchet, Gabriel Rodríguez, Juan Touriño
<span title="2020-11-10">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
This paper studies the physical layout of an Intel Knights Landing processor, with a particular focus on the coherence subsystem, and uncovers the pseudo-random mapping function of physical memory blocks  ...  Manycore processors feature a high number of general-purpose cores designed to work in a multithreaded fashion. Recent manycore processors are kept coherent using scalable distributed directories.  ...  of Spain (FPU16/00816), and by the U.S.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2011.05422v1">arXiv:2011.05422v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/277m4su4bnholla6fiv4vwgx54">fatcat:277m4su4bnholla6fiv4vwgx54</a> </span>

Evaluating and Optimizing the NERSC Workload on Knights Landing

Taylor Barnes, Brandon Cook, Jack Deslippe, Douglas Doerfler, Brian Friesen, Yun He, Thorsten Kurth, Tuomas Koskela, Mathieu Lobet, Tareq Malas, Leonid Oliker, Andrey Ovsyannikov (+10 others)
<span title="">2016</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/zigbcra6rjdivda6lkzknwuo5q" style="color: black;">2016 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)</a> </i> &nbsp;
NERSC has partnered with 20 representative application teams to evaluate performance on the Xeon-Phi Knights Landing architecture and develop an application-optimization strategy for the greater NERSC  ...  In this article, we present early case studies and summarized results from a subset of the 20 applications highlighting the impact of important architecture differences between the Xeon-Phi and traditional  ...  On-package MCDRAM: The Knights Landing processor has 1 MB of L2 cache per tile (shared between two compute cores) but lacks a shared L3 cache.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/pmbs.2016.010">doi:10.1109/pmbs.2016.010</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/sc/BarnesCDDFHKKLM16.html">dblp:conf/sc/BarnesCDDFHKKLM16</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/a5mvtzldincrpjtzeft6qlydyq">fatcat:a5mvtzldincrpjtzeft6qlydyq</a> </span>

Exposing the Locality of Heterogeneous Memory Architectures to HPC Applications

Brice Goglin
<span title="">2016</span> <i title="ACM Press"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wfntzrhhincf5m7bhuvjkqmkpy" style="color: black;">Proceedings of the Second International Symposium on Memory Systems - MEMSYS &#39;16</a> </i> &nbsp;
We present an in-depth study of the software view of the upcoming Intel Knights Landing processor.  ...  Therefore locality is a major area of optimization on the road to exascale. Indeed, tasks and data have to be carefully distributed on the computing and memory resources.  ...  ACKNOWLEDGMENTS We would like to thank Intel for providing us with hints for designing our new hwloc model.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/2989081.2989115">doi:10.1145/2989081.2989115</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/memsys/Goglin16.html">dblp:conf/memsys/Goglin16</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/eev2v2bomzcdri2gnsnfyn3fey">fatcat:eev2v2bomzcdri2gnsnfyn3fey</a> </span>

Code modernization strategies to 3-D Stencil-based applications on Intel Xeon Phi: KNC and KNL

Juan M. Cebrián, José M. Cecilia, Mario Hernández, José M. García
<span title="">2017</span> <i title="Elsevier BV"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/nkrwe4pmozafvnd72yxufztpku" style="color: black;">Computers and Mathematics with Applications</a> </i> &nbsp;
These techniques lead to performance gains of up to 15x for the first generation of Xeon Phi: Knights Corner (KNC), and an additional average 2.5x improvement for Knights Landing (KNL).  ...  These methods produce an approximate solution to the problem based on Stencil patterns of computation.  ...  Our modernized codes are optimized on the first Xeon Phi generation (Knights Corner, or KNC), and then migrated to the (recently released) second generation, Knights Landing (or KNL).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1016/j.camwa.2017.07.032">doi:10.1016/j.camwa.2017.07.032</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/x5tlbmuvizchfesgoyscbf43e4">fatcat:x5tlbmuvizchfesgoyscbf43e4</a> </span>

Optimizing Coherence Traffic in Manycore Processors using Closed-Form Caching/Home Agent Mappings

Steve Kommrusch, Marcos Horro, Louis-Noel Pouchet, Gabriel Rodriguez, Juan Tourino
<span title="">2021</span> <i title="Institute of Electrical and Electronics Engineers (IEEE)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/q7qi7j4ckfac7ehf3mjbso4hne" style="color: black;">IEEE Access</a> </i> &nbsp;
This paper studies the physical layout of an Intel Knights Landing processor, with a particular focus on the coherence subsystem, and uncovers the pseudo-random mapping function of physical memory blocks  ...  Manycore processors feature a high number of general-purpose cores designed to work in a multithreaded fashion. Recent manycore processors are kept coherent using scalable distributed directories.  ...  BACKGROUND AND OVERVIEW This paper studies the Intel Knights Landing (KNL) architecture as a paramount example of the Intel Mesh interconnect.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/access.2021.3058280">doi:10.1109/access.2021.3058280</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/zd26nmknuvfpbi5mtvhj6zz2ry">fatcat:zd26nmknuvfpbi5mtvhj6zz2ry</a> </span>