### Optimal Sparse Matrix Dense Vector Multiplication in the I/O-Model

2010
Theory of Computing Systems
... for A in column major layout. ... The I/O-model: The I/O-model is the natural extension of the I/O-model of Aggarwal and Vitter [1] to the situation of matrix-vector multiplication. ... Abstract: We study the problem of sparse-matrix dense-vector multiplication (SpMV) in external memory. The task of SpMV is to compute y := Ax, where A is a sparse N × N matrix and x is a vector. ...

doi:10.1007/s00224-010-9285-4
fatcat:24w4kli6hjejjifrrytj2ej3iy
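The SpMV task named in the entry above, y := Ax with A in column major layout, can be illustrated with a minimal in-memory sketch. The triple representation `(row, col, value)` and the function name `spmv_column_major` are assumptions made for illustration and are not taken from the paper; the sketch shows only the arithmetic and the access pattern, not an I/O-optimal algorithm.

```python
from typing import List, Tuple

# (row, col, value) triples: an assumed representation for illustration.
Entry = Tuple[int, int, float]


def spmv_column_major(entries: List[Entry], x: List[float], n: int) -> List[float]:
    """Compute y := A x for an n x n sparse matrix given as nonzero triples."""
    # Column major layout: nonzeros ordered by column first, then by row.
    ordered = sorted(entries, key=lambda e: (e[1], e[0]))
    y = [0.0] * n
    for row, col, value in ordered:
        # A and x are read in order, but the writes into y are scattered.
        y[row] += value * x[col]
    return y


A = [(0, 0, 2.0), (2, 0, 1.0), (1, 1, 3.0), (0, 2, 4.0)]
x = [1.0, 2.0, 3.0]
print(spmv_column_major(A, x, 3))  # [14.0, 6.0, 1.0]
```

In this ordering the matrix and the input vector are scanned sequentially while the output vector is accessed irregularly, which is the kind of access pattern an external-memory analysis has to account for.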
### Optimal sparse matrix dense vector multiplication in the I/O-model

2007
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures - SPAA '07
doi:10.1145/1248377.1248391
dblp:conf/spaa/BenderBFJV07
fatcat:cteafwb36vd5dodxkwknbf3kte
### Evaluating Non-square Sparse Bilinear Forms on Multiple Vector Pairs in the I/O-Model
[chapter]

2010
Lecture Notes in Computer Science
We consider evaluating one bilinear form defined by a sparse N_y × N_x matrix A having h entries on w pairs of vectors. The model of computation is the semiring I/O-model with main memory size M and block ... To this end, we present asymptotically optimal algorithms and matching lower bounds. Moreover, we show that multiplying the matrix A with w vectors has the same worst-case I/O-complexity. ... Considering the evaluation of matrix vector products on multiple vectors is a step towards closing the gap between sparse matrix vector multiplication and sparse matrix dense matrix multiplication since ...

doi:10.1007/978-3-642-15155-2_35
fatcat:cd3hoywjwbc6hh73vkwaflpvla
### Fine-grained I/O Complexity via Reductions: New Lower Bounds, Faster Algorithms, and a Time Hierarchy

2018
Innovations in Theoretical Computer Science
We generate new I/O assumptions based on the difficulty of improving sparse graph problem running times in the I/O model. ... From these I/O-model assumptions, we show that many of the known reductions in the word-RAM model can naturally extend to hold in the I/O model as well (e.g., a lower bound on the I/O complexity of Longest ... We thank the anonymous reviewers for their helpful suggestions. ...

doi:10.4230/lipics.itcs.2018.34
dblp:conf/innovations/DemaineLLLW18
fatcat:dwdccig625e2zfrm6yxrxrnor4
### The Input/Output Complexity of Sparse Matrix Multiplication
2014
Lecture Notes in Computer Science
In the classical paper of Hong and Kung (STOC '81) it was shown that to compute a product of dense U × U matrices, Θ(U^3 / (B√M)) I/Os are necessary and sufficient in the I/O model with internal memory ... While our lower bound uses fairly standard techniques, the upper bound makes use of "compressed matrix multiplication" sketches, which is new in the context of I/O-efficient algorithms, and a new matrix ... In the I/O model introduced by Aggarwal and Vitter [1] the optimal matrix multiplication algorithm for the dense case already existed (see Section 1.2) and since then sparse-dense and sparse-sparse combinations ...

### The Input/Output Complexity of Sparse Matrix Multiplication
2014
arXiv
In the classical paper of Hong and Kung (STOC '81) it was shown that to compute a product of dense U × U matrices, Θ(U^3 / (B√M)) I/Os are necessary and sufficient in the I/O model with internal memory ... While our lower bound uses fairly standard techniques, the upper bound makes use of "compressed matrix multiplication" sketches, which is new in the context of I/O-efficient algorithms, and a new matrix ... Since Z/2 outputs are needed either in the direct or the indirect way, the number of I/Os needed becomes the minimum of the two lower bounds; we get Theorem 2. ...

### Randomized Primitives for Big Data Processing

2017
Künstliche Intelligenz
Our results in the I/O model can be summarized as: In the RAM model we seek to use fast matrix multiplication, i.e., the (U ) method to compute fast matrix products. ... In the I/O model we settle the I/O complexity of sparse matrix multiplication: We provide a new algorithm and a matching lower bound, both parameterized on input sparsity N and output sparsity Z. ...

doi:10.1007/s13218-017-0515-7
fatcat:3ffl5jvmq5ayte2tzmpir7jz4e
### Foreword

2010
Theory of Computing Systems
... on the distributed complexity and local approximability of the Capacitated Dominating Set Problem. • "Strong-Diameter Decompositions of Minor Free Graphs" provides the first sparse covers and probabilistic ... for Multiprocessor Scheduling under Uncertainty" presents polynomial-time approximation algorithms for the multiprocessor scheduling problem in scenarios where there is uncertainty in the successful execution ... Multiplication in the I/O-Model" investigates the worst-case complexity in terms of the number of I/O operations in order to multiply a matrix with a vector that are stored in some external memory. ...

doi:10.1007/s00224-010-9284-5
fatcat:twbodq7n7ngkvmxxppywh2pmny
### Fine-Grained I/O Complexity via Reductions: New lower bounds, faster algorithms, and a time hierarchy
2017
arXiv
We generate new I/O assumptions based on the difficulty of improving sparse graph problem running times in the I/O model. ... Finally, we prove an analog of the Time Hierarchy Theorem in the I/O model. ... Acknowledgements: We thank the anonymous reviewers for their helpful suggestions. ...

arXiv:1711.07960v3
fatcat:i7sgttq64rfclpnoibguqiqduy
### An out-of-core implementation of the COLUMBUS massively-parallel multireference configuration interaction program

1998
Proceedings of the IEEE/ACM SC98 Conference
From the mathematical perspective, the program solves the eigenvalue problem for a very large, sparse, symmetric Hamilton matrix. ... techniques on vector supercomputers. ... To support the noncollective I/O model of Shared Files we developed the Distant I/O model [11]. ...

doi:10.1109/sc.1998.10027
dblp:conf/sc/DachselNH98
fatcat:pzp6v2riwze6tppwbjxhasi3vq
### The I/O Complexity of Sparse Matrix Dense Matrix Multiplication
2010
Lecture Notes in Computer Science
We consider the multiplication of a sparse N × N matrix A with a dense N × N matrix B in the I/O model. ... sparse matrix) that use only a constant factor more I/Os. ... In this paper, we determine the complexity to multiply two N × N matrices, one of which is k-sparse, in the semiring I/O-model with tall cache M ≥ B^2 to be Θ(max{kN^2/(B∆), kN^2/(B√M), N^2/B, 1}) where ∆ ...

doi:10.1007/978-3-642-12200-2_14
fatcat:ihi6o55ie5ggbjli4rqlpxvy2y
### Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format
2020
arXiv
Multiplication of a sparse matrix to a dense matrix (SpDM) is widely used in many areas like scientific computing and machine learning. ... The storage data structures help sparse matrices store in a memory-saving format, but they bring difficulties in optimizing the performance of SpDM on modern GPUs due to irregular data access of the sparse ... Regarding the SpDM algorithm analysis, Greiner et al. [36] propose an I/O model to interpret the lower bound of efficient serial algorithms. ...

### SLACID - sparse linear algebra in a column-oriented in-memory database system

2014
Proceedings of the 26th International Conference on Scientific and Statistical Database Management - SSDBM '14
In this paper, we present a sparse matrix chain optimizer (SpMachO) that creates an execution plan, which is composed of multiplication operators and transformations between sparse and dense matrix storage ... We introduce a comprehensive cost model for sparse-, dense- and hybrid multiplication kernels. ... Hence, mainly the memory accesses contribute to the runtime of an algorithm, which is in conformance to the external memory (I/O) model. ...

### ICE: A General and Validated Energy Complexity Model for Multithreaded Algorithms
2016
arXiv
The new model is validated by different sparse matrix vector multiplication (SpMV) algorithms and dense matrix multiplication (matmul) algorithms running on high performance computing (HPC) platforms ( ... In order to improve the usability and accuracy of the new model for a wide range of platforms, the platform parameters of ICE model are provided for eleven platforms including HPC, accelerator and embedded ... The I/O complexity of CSC in the sequential I/O model of column-major layout is O(nz) [5]. Similar to CSR, scanning all non-zero elements of matrix A in CSC format costs O(nz/B) I/Os. ...
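The O(nz/B) scanning cost quoted in the snippet above can be made concrete with a small sketch. It assumes the nz non-zero values are stored contiguously and that one I/O transfers a block of B consecutive entries; the names `scan_ios` and `scan_blocks` are hypothetical helpers introduced here for illustration and do not come from the paper.

```python
from typing import Iterator, List


def scan_ios(nz: int, block_size: int) -> int:
    """Block transfers needed to read nz contiguous entries, B entries per block."""
    if nz < 0 or block_size <= 0:
        raise ValueError("nz must be non-negative and block_size positive")
    # Integer ceiling of nz / block_size.
    return (nz + block_size - 1) // block_size


def scan_blocks(values: List[float], block_size: int) -> Iterator[List[float]]:
    """Yield the stored values one block at a time, as a block-wise scan reads them."""
    for start in range(0, len(values), block_size):
        yield values[start:start + block_size]


values = [float(i) for i in range(10)]   # stand-in for the non-zero values of A
blocks = list(scan_blocks(values, 4))
print(len(blocks), scan_ios(len(values), 4))  # 3 3
```

The count is the ceiling of nz/B, so a scan touches each block once regardless of how the non-zeros are distributed over the columns.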

### A Memory Model for Scientific Algorithms on Graphics Processors

2006
ACM/IEEE SC 2006 Conference (SC'06)
In order to demonstrate the effectiveness of our model, we highlight its performance on three memory-intensive scientific applications: sorting, fast Fourier transform and dense matrix-multiplication. ... In practice, we are able to achieve 2-5× performance improvement. ... Acknowledgements: This work is supported in part by ARO Contracts DAAD19-02-1-0390 and W911NF-04-1-0088, NSF awards 0400134 and 0118743, DARPA/RDECOM Contract N61339-04-C-0043, ONR Contract N00014-01-1- ...

