86,182 Hits in 3.6 sec

Avoiding communication in sparse matrix computations

James Demmel, Mark Hoemmen, Marghoob Mohiyuddin, Katherine Yelick
<span title="">2008</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/5vsih2yegrfubf7el6ncng3mgq" style="color: black;">Proceedings, International Parallel and Distributed Processing Symposium (IPDPS)</a> </i> &nbsp;
In this paper we focus on an alternative building block for sparse iterative solvers, the "matrix powers kernel" [x, Ax, A^2 x, ..., A^k x], and show that by organizing computations around this kernel  ...  As the gap between computation and communication speed continues to widen, these traditional sparse methods will suffer.  ...  and may favor algorithms with higher computational cost if they avoid communication.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ipdps.2008.4536305">doi:10.1109/ipdps.2008.4536305</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/ipps/DemmelHMY08.html">dblp:conf/ipps/DemmelHMY08</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/kg27cierrjak7fjelgfawn3ybi">fatcat:kg27cierrjak7fjelgfawn3ybi</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20090709152748/http://www.cse.psu.edu/~raghavan/cse598C/Demmel.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/92/d3/92d39765a46275b78d33f53fc9158090a41f82ce.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ipdps.2008.4536305"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

Communication-Avoiding Krylov Techniques on Graphic Processing Units

Maryam Mehri Dehnavi, Yousef El-Kurdi, James Demmel, Dennis Giannacopoulos
<span title="">2013</span> <i title="Institute of Electrical and Electronics Engineers (IEEE)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/vszaq5iywbbidortfyofkor4lu" style="color: black;">IEEE transactions on magnetics</a> </i> &nbsp;
Communication-avoiding techniques reduce the communication cost of Krylov subspace methods by computing several vectors of a Krylov subspace "at once," using a kernel called "matrix powers."  ...  The matrix powers kernel is implemented on a recent generation of NVIDIA GPUs and speedups of up to 5.7 times are reported for the communication-avoiding matrix powers kernel compared to the standards  ...  The proposed implementation of the communication-avoiding matrix powers kernel on GPUs will be used in communication-avoiding KSMs in future work.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/tmag.2013.2244861">doi:10.1109/tmag.2013.2244861</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/pk3apxr6bbcjxmpzci6v3qv6ka">fatcat:pk3apxr6bbcjxmpzci6v3qv6ka</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20181030044333/http://digitool.library.mcgill.ca/webclient/StreamGate?folder_id=0&amp;dvs=1540874594446~136" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/02/6b/026b33f46477db7efde7445a8856b4a327fe632b.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/tmag.2013.2244861"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

Communication-optimal iterative methods

J Demmel, M Hoemmen, M Mohiyuddin, K Yelick
<span title="2009-07-01">2009</span> <i title="IOP Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wxgp7pobnrfetfizidmpebi4qy" style="color: black;">Journal of Physics, Conference Series</a> </i> &nbsp;
By reorganizing the sparse matrix kernel to compute a set of matrix-vector products at once and reorganizing the rest of the algorithm accordingly, we can perform s iterations by sending O(log P) messages  ...  Here, s iterations of algorithms such as CG, GMRES, Lanczos, and Arnoldi perform s sparse matrix-vector multiplications and Ω(s) vector reductions, resulting in a growth of Ω(s) in both single-node and  ...  Using ideas previously introduced in [10, 11], we describe how to avoid all but Θ(1) communication while computing W.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1088/1742-6596/180/1/012040">doi:10.1088/1742-6596/180/1/012040</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/7h56tbf3sfgfnezaisyxkmwpbe">fatcat:7h56tbf3sfgfnezaisyxkmwpbe</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220203040224/https://iopscience.iop.org/article/10.1088/1742-6596/180/1/012040/pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/ff/16/ff16dbe4b62d886ce4729cfe659c246cfe5cd1a3.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1088/1742-6596/180/1/012040"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> iop.org </button> </a>

SMASH

Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina Giannoula, Roknoddin Azizi, Skanda Koppula, Nika Mansouri Ghiasi, Taha Shahroodi, Juan Gomez Luna, Onur Mutlu
<span title="">2019</span> <i title="ACM Press"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/sbn7zssnbfcjpjzmgy24brukwi" style="color: black;">Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture - MICRO &#39;52</a> </i> &nbsp;
SMASH exposes an expressive and rich ISA to communicate with the BMU, which enables its use in accelerating any sparse matrix computation.  ...  These operations use sparse matrix compression as an effective means to avoid storing zeros and performing unnecessary computation on zero elements.  ...  This research was supported in part by the Semiconductor Research Corporation.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/3352460.3358286">doi:10.1145/3352460.3358286</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/micro/KanellopoulosVG19.html">dblp:conf/micro/KanellopoulosVG19</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/m2gculfjibgqdhzmwetwzh2g7e">fatcat:m2gculfjibgqdhzmwetwzh2g7e</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200906164756/https://arxiv.org/pdf/1910.10776v1.pdf" title="fulltext PDF download [not primary version]" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <span style="color: #f43e3e;">&#10033;</span> <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/e2/06/e206fb8f900595392291f1eddd1017e5f9f8a1aa.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/3352460.3358286"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> acm.org </button> </a>

The parallelism motifs of genomic data analysis

Katherine Yelick, Aydın Buluç, Muaaz Awan, Ariful Azad, Benjamin Brock, Rob Egan, Saliya Ekanayake, Marquita Ellis, Evangelos Georganas, Giulia Guidi, Steven Hofmeyr, Oguz Selvitopi (+2 others)
<span title="2020-01-20">2020</span> <i title="The Royal Society"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/ercgg4vn2fenngurcnadfzdfri" style="color: black;">Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences</a> </i> &nbsp;
Enormous community databases store and share these data with the research community, but some of these genomic data analysis problems require large-scale computational platforms to meet both the memory  ...  and computational requirements.  ...  [Table comparing Colella's 7 Dwarfs, Berkeley View Motifs, NRC 7 Giants, and Genomics Motifs; rows include Dense Matrix, Sparse Matrix, and Structured Grid]  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1098/rsta.2019.0394">doi:10.1098/rsta.2019.0394</a> <a target="_blank" rel="external noopener" href="https://www.ncbi.nlm.nih.gov/pubmed/31955674">pmid:31955674</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/kzujmq5u2refvhoovtb2ap5vha">fatcat:kzujmq5u2refvhoovtb2ap5vha</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200321140909/https://arxiv.org/pdf/2001.06989v1.pdf" title="fulltext PDF download [not primary version]" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <span style="color: #f43e3e;">&#10033;</span> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1098/rsta.2019.0394"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> Publisher / doi.org </button> </a>

The Chunks and Tasks Matrix Library 2.0 [article]

Emanuel H. Rubensson, Elias Rudberg, Anastasia Kruchinina, Anton G. Artemov
<span title="2020-11-23">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
The library implements a number of sparse matrix algorithms for distributed memory parallelization that are able to dynamically exploit data locality to avoid movement of data.  ...  We present a C++ header-only parallel sparse matrix library, based on sparse quadtree representation of matrices using the Chunks and Tasks programming model.  ...  Computational resources were provided by the Swedish National Infrastructure for Computing (SNIC) at the PDC Center for High Performance Computing, KTH Royal Institute of Technology in Stockholm.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2011.11762v1">arXiv:2011.11762v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/mjrowql7rvcdnmdmk6vagxgdp4">fatcat:mjrowql7rvcdnmdmk6vagxgdp4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20201128224957/https://arxiv.org/pdf/2011.11762v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/1e/83/1e831b4663dcfbc7ad6b792eac99f6a5fa740599.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2011.11762v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

A 2D algorithm with asymmetric workload for the UPC conjugate gradient method

Jorge González-Domínguez, Osni A. Marques, María J. Martín, Juan Touriño
<span title="2014-09-18">2014</span> <i title="Springer Nature"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/qhbautqnvzgwvm3vvdvieylhwq" style="color: black;">Journal of Supercomputing</a> </i> &nbsp;
Firstly, typical 1D and 2D distributions of the matrix involved in CG computations are considered.  ...  Then, a new 2D version of the CG method with asymmetric workload, based on leaving some threads idle during part of the computation to reduce communication, is proposed.  ...  In the benchmarks, the CG method is used to compute an approximation for the smallest eigenvalue of a sparse symmetric positive definite matrix.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s11227-014-1300-0">doi:10.1007/s11227-014-1300-0</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/cu24226h7fbrrkfj5tqmyxzqea">fatcat:cu24226h7fbrrkfj5tqmyxzqea</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170705113620/http://www.des.udc.es/%7Ejuan/papers/JoS2014.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/37/8a/378af2ac8c10e4b4ff29009cb7685a94a48f7fc3.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s11227-014-1300-0"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> springer.com </button> </a>

DBCSR: A Blocked Sparse Tensor Algebra Library [article]

Ilia Sivkov, Patrick Seewald, Alfio Lazzaro, Juerg Hutter
<span title="2019-10-29">2019</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Examples are sparse matrix-matrix multiplications in linear-scaling Kohn-Sham calculations or the efficient determination of the exact exchange energy.  ...  In particular, we introduce the tensor contraction based on a fast rectangular sparse matrix multiplication algorithm.  ...  supported by grants from the Swiss National Supercomputing Centre (CSCS) under projects S238 and UZHP and received funding from the Swiss University Conference through the Platform for Advanced Scientific Computing  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1910.13555v1">arXiv:1910.13555v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/cyqax4amdnhqnkbnjob3myltdm">fatcat:cyqax4amdnhqnkbnjob3myltdm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200824185901/https://arxiv.org/pdf/1910.13555v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/ae/b7/aeb70214e770a7667ff9845335270c234f9b1f2f.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1910.13555v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Dataflow acceleration of Krylov subspace sparse banded problems

Pavel Burovskiy, Stephen Girdlestone, Craig Davies, Spencer Sherwin, Wayne Luk
<span title="">2014</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/plojvu5mhreuxnue5ey7fivkbi" style="color: black;">2014 24th International Conference on Field Programmable Logic and Applications (FPL)</a> </i> &nbsp;
Most of the efforts in the FPGA community related to sparse linear algebra focus on increasing the degree of internal parallelism in matrix-vector multiply kernels.  ...  We illustrate our approach for Google PageRank computation by power iteration for large banded single precision sparse matrices.  ...  ACKNOWLEDGMENT This work is supported in part by the European Union Seventh Framework Programme under grant agreement number 257906, 287804 and 318521, by the UK EPSRC, by the Maxeler University Programme  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/fpl.2014.6927453">doi:10.1109/fpl.2014.6927453</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/fpl/BurovskiyGDSL14.html">dblp:conf/fpl/BurovskiyGDSL14</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/kqiawknxq5htzdxe6zfnmua3ue">fatcat:kqiawknxq5htzdxe6zfnmua3ue</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170813132525/http://spiral.imperial.ac.uk/bitstream/10044/1/23844/2/fpl14pb-final.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/20/07/200724778aea8a65a26568c8d41291fce6051f75.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/fpl.2014.6927453"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

GHOST: Building Blocks for High Performance Sparse Linear Algebra on Heterogeneous Systems

Moritz Kreutzer, Jonas Thies, Melven Röhrig-Zöllner, Andreas Pieper, Faisal Shahzad, Martin Galgon, Achim Basermann, Holger Fehske, Georg Hager, Gerhard Wellein
<span title="2016-10-01">2016</span> <i title="Springer Nature"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/6ni4hdyv7zdtzjxnzscp53l5ry" style="color: black;">International journal of parallel programming</a> </i> &nbsp;
The "General, Hybrid, and Optimized Sparse Toolkit" (GHOST) is a collection of building blocks that targets algorithms dealing with sparse matrix representations on current and future large-scale systems  ...  While many of the architectural details of future exascale-class high performance computer systems are still a matter of intense research, there appears to be a general consensus that they will be strongly  ...  Special thanks go to Andreas Alvermann for providing sparse matrix generation functions for testing and everyone else who contributed to GHOST, directly or indirectly.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s10766-016-0464-z">doi:10.1007/s10766-016-0464-z</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/w64ypr4otfgcxbfftzzuko5nim">fatcat:w64ypr4otfgcxbfftzzuko5nim</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170922023510/http://elib.dlr.de/100066/1/ghost.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/17/af/17afbdc3bf09d3ffcd11a5a8bfe076b8fe6fd47c.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s10766-016-0464-z"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> springer.com </button> </a>

Communication optimal parallel multiplication of sparse random matrices

Grey Ballard, Aydin Buluc, James Demmel, Laura Grigori, Benjamin Lipshitz, Oded Schwartz, Sivan Toledo
<span title="">2013</span> <i title="ACM Press"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/tewj77cuufbzbgbk265bb462ga" style="color: black;">Proceedings of the 25th ACM symposium on Parallelism in algorithms and architectures - SPAA &#39;13</a> </i> &nbsp;
Parallel algorithms for sparse matrix-matrix multiplication typically spend most of their time on inter-processor communication rather than on computation, and hardware trends predict the relative cost  ...  Thus, sparse matrix multiplication algorithms must minimize communication costs in order to scale to large processor counts.  ...  We show in this paper that existing algorithms for sparse matrix-matrix multiplication are not optimal in their communication costs, and we obtain new algorithms which are communication optimal, communicating  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/2486159.2486196">doi:10.1145/2486159.2486196</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/spaa/BallardBDGLST13.html">dblp:conf/spaa/BallardBDGLST13</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/cxmtrglw7rdkzdg2in777endyu">fatcat:cxmtrglw7rdkzdg2in777endyu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170830011705/https://crd.lbl.gov/assets/pubs_presos/spaa134-ballard.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/e2/74/e2740a18e0900514bc5c95d43178865e06b592b4.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/2486159.2486196"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> acm.org </button> </a>

Communication Optimization of Iterative Sparse Matrix-Vector Multiply on GPUs and FPGAs

Abid Rafique, George A. Constantinides, Nachiket Kapre
<span title="">2015</span> <i title="Institute of Electrical and Electronics Engineers (IEEE)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/ll6bfs5o6bahfinh3u5z2cnyoy" style="color: black;">IEEE Transactions on Parallel and Distributed Systems</a> </i> &nbsp;
Trading communication with redundant computation can increase the silicon efficiency of FPGAs and GPU in accelerating communication-bound sparse iterative solvers.  ...  growth in redundant computation.  ...  Also, we would like to thank Mark Hoemmen and Marghoob Mohiyuddin, University of California Berkeley for giving useful suggestions that help in improving the draft.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/tpds.2014.6">doi:10.1109/tpds.2014.6</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/rm3g45jhbjgbvkqisdrw3tgmxi">fatcat:rm3g45jhbjgbvkqisdrw3tgmxi</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200320195139/http://cas.ee.ic.ac.uk/people/gac1/pubs/AbidTPDS13.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/2f/71/2f7184619298cd5f5dbf4f2ed877551a5c4b0846.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/tpds.2014.6"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation [article]

Penporn Koanantakool, Alnur Ali, Ariful Azad, Aydin Buluc, Dmitriy Morozov, Leonid Oliker, Katherine Yelick, Sang-Yun Oh
<span title="2018-04-08">2018</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Across a variety of scientific disciplines, sparse inverse covariance estimation is a popular tool for capturing the underlying dependency relationships in multivariate data.  ...  Our parallel proximal gradient method uses a novel communication-avoiding linear algebra algorithm and runs across a multi-node cluster with up to 1k nodes (24k cores), achieving parallel scalability on  ...  All the matrix products can be computed using the communication-avoiding algorithm for dense-dense and sparse-dense matrix multiplication that we present below, while the transpose can be computed via  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1710.10769v2">arXiv:1710.10769v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/qxqwez6afjg4bkifnx3p5jrkwa">fatcat:qxqwez6afjg4bkifnx3p5jrkwa</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20191013154641/https://arxiv.org/pdf/1710.10769v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/fe/26/fe268b9062b6ba1978485948900b68c6edbdfab8.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1710.10769v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

The Combinatorial BLAS: design, implementation, and applications

Aydın Buluç, John R Gilbert
<span title="2011-05-19">2011</span> <i title="SAGE Publications"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/prmoui3bz5a2jktzh4pt7dtery" style="color: black;">The international journal of high performance computing applications</a> </i> &nbsp;
Large combinatorial graphs appear in many applications of high-performance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse matrix methods.  ...  The library is evaluated using two important graph algorithms, in terms of both performance and ease-of-use.  ...  For example, communication can be overlapped with computation in the SpGEMM function by prefetching the internal arrays through one sided communication.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1177/1094342011403516">doi:10.1177/1094342011403516</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/hyplcu2f2vco3oj4fv3e3hdevq">fatcat:hyplcu2f2vco3oj4fv3e3hdevq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20110401104019/http://www.cs.ucsb.edu/research/tech_reports/reports/2010-18.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/c7/f8/c7f82b511ef5bb5c87f8d057e022fe3bd812fcdc.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1177/1094342011403516"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> sagepub.com </button> </a>

Preparing sparse solvers for exascale computing

Hartwig Anzt, Erik Boman, Rob Falgout, Pieter Ghysels, Michael Heroux, Xiaoye Li, Lois Curfman McInnes, Richard Tran Mills, Sivasankaran Rajamanickam, Karl Rupp, Barry Smith, Ichitaro Yamazaki (+1 other)
<span title="2020-01-20">2020</span> <i title="The Royal Society"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/ercgg4vn2fenngurcnadfzdfri" style="color: black;">Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences</a> </i> &nbsp;
This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing project towards providing sparse solvers for exascale computing platforms.  ...  Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms.  ...  One way to address assembly and sparse data structure challenges is the ever-attractive approach of avoiding explicit formation of sparse matrix representations and instead use matrixfree formulations  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1098/rsta.2019.0053">doi:10.1098/rsta.2019.0053</a> <a target="_blank" rel="external noopener" href="https://www.ncbi.nlm.nih.gov/pubmed/31955673">pmid:31955673</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/bqw6xqixbrabddmxglmtcbw2wa">fatcat:bqw6xqixbrabddmxglmtcbw2wa</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200507183315/https://escholarship.org/content/qt0r56p10n/qt0r56p10n.pdf?t=q733f0" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/e1/02/e102222e5749c041aa26fe66670353a9ab94c923.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1098/rsta.2019.0053"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> Publisher / doi.org </button> </a>
Showing results 1&ndash;15 out of 86,182 results