
Optimal Sparse Matrix Dense Vector Multiplication in the I/O-Model

Michael A. Bender, Gerth Stølting Brodal, Rolf Fagerberg, Riko Jacob, Elias Vicari
2010 Theory of Computing Systems  
for A in column-major layout.  ...  The I/O-model is the natural extension of the I/O-model of Aggarwal and Vitter [1] to the situation of matrix-vector multiplication.  ...  Abstract: We study the problem of sparse-matrix dense-vector multiplication (SpMV) in external memory. The task of SpMV is to compute y := Ax, where A is a sparse N × N matrix and x is a vector.  ...
doi:10.1007/s00224-010-9285-4 fatcat:24w4kli6hjejjifrrytj2ej3iy
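As a point of reference for the SpMV task quoted above (y := Ax with a sparse N × N matrix A), the following is a minimal Python sketch assuming A is stored column by column as (row, value) pairs; this layout and the helper name spmv_column_major are illustrative stand-ins, not code from the paper.

    # Minimal SpMV sketch: A given as a list of columns, column j holding (row, value) pairs.
    def spmv_column_major(columns, x):
        y = [0.0] * len(x)
        for j, col in enumerate(columns):   # stream A one column at a time
            xj = x[j]                       # the single entry of x this column needs
            for i, a_ij in col:             # each nonzero contributes a_ij * x_j to y_i
                y[i] += a_ij * xj
        return y

    # Example: a 3 x 3 matrix with 4 nonzeros.
    A_cols = [[(0, 2.0)], [(1, 1.0), (2, 4.0)], [(0, 3.0)]]
    print(spmv_column_major(A_cols, [1.0, 1.0, 1.0]))   # [5.0, 1.0, 4.0]

In the I/O-model the arithmetic itself is free; what the paper bounds is how many blocks of size B must move between a main memory of size M and external memory while the scattered updates to y are performed.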

Optimal sparse matrix dense vector multiplication in the I/O-model

Michael A. Bender, Gerth Stølting Brodal, Rolf Fagerberg, Riko Jacob, Elias Vicari
2007 Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures - SPAA '07  
for A in column-major layout.  ...  The I/O-model is the natural extension of the I/O-model of Aggarwal and Vitter [1] to the situation of matrix-vector multiplication.  ...  Abstract: We study the problem of sparse-matrix dense-vector multiplication (SpMV) in external memory. The task of SpMV is to compute y := Ax, where A is a sparse N × N matrix and x is a vector.  ...
doi:10.1145/1248377.1248391 dblp:conf/spaa/BenderBFJV07 fatcat:cteafwb36vd5dodxkwknbf3kte

Evaluating Non-square Sparse Bilinear Forms on Multiple Vector Pairs in the I/O-Model [chapter]

Gero Greiner, Riko Jacob
2010 Lecture Notes in Computer Science  
We consider evaluating one bilinear form defined by a sparse N_y × N_x matrix A having h entries on w pairs of vectors. The model of computation is the semiring I/O-model with main memory size M and block  ...  To this end, we present asymptotically optimal algorithms and matching lower bounds. Moreover, we show that multiplying the matrix A with w vectors has the same worst-case I/O-complexity.  ...  Considering the evaluation of matrix-vector products on multiple vectors is a step towards closing the gap between sparse matrix-vector multiplication and sparse matrix dense-matrix multiplication, since  ...
doi:10.1007/978-3-642-15155-2_35 fatcat:cd3hoywjwbc6hh73vkwaflpvla
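Read literally, the quantity studied above is the bilinear form y^T A x evaluated for w vector pairs over a sparse A with h entries; a minimal sketch of that computation (with A given as hypothetical (i, j, value) triples, not the paper's data layout) looks like this:

    # Evaluate y^T A x for a sparse A given as (i, j, value) triples.
    def bilinear_form(entries, y, x):
        return sum(a_ij * y[i] * x[j] for (i, j, a_ij) in entries)

    # The w vector pairs reuse the same h entries of A, once per pair.
    def bilinear_forms(entries, pairs):
        return [bilinear_form(entries, y, x) for (y, x) in pairs]

The I/O question addressed by the paper is, roughly, how to schedule these h entries and the w vector pairs so that blocks of size B loaded into a memory of size M are reused as much as possible.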

Fine-grained I/O Complexity via Reductions: New Lower Bounds, Faster Algorithms, and a Time Hierarchy

Erik D. Demaine, Andrea Lincoln, Quanquan C. Liu, Jayson Lynch, Virginia Vassilevska Williams, Marc Herbstritt
2018 Innovations in Theoretical Computer Science  
We generate new I/O assumptions based on the difficulty of improving sparse graph problem running times in the I/O model.  ...  From these I/O-model assumptions, we show that many of the known reductions in the word-RAM model can naturally extend to hold in the I/O model as well (e.g., a lower bound on the I/O complexity of Longest  ...  We thank the anonymous reviewers for their helpful suggestions.  ... 
doi:10.4230/lipics.itcs.2018.34 dblp:conf/innovations/DemaineLLLW18 fatcat:dwdccig625e2zfrm6yxrxrnor4

The Input/Output Complexity of Sparse Matrix Multiplication [chapter]

Rasmus Pagh, Morten Stöckel
2014 Lecture Notes in Computer Science  
In the classical paper of Hong and Kung (STOC '81) it was shown that to compute a product of dense U × U matrices, Θ(U^3 / (B√M)) I/Os are necessary and sufficient in the I/O model with internal memory  ...  While our lower bound uses fairly standard techniques, the upper bound makes use of "compressed matrix multiplication" sketches, which is new in the context of I/O-efficient algorithms, and a new matrix  ...  In the I/O model introduced by Aggarwal and Vitter [1] the optimal matrix multiplication algorithm for the dense case already existed (see Section 1.2), and since then sparse-dense and sparse-sparse combinations  ...
doi:10.1007/978-3-662-44777-2_62 fatcat:sxodluuy5vbcbhznka577x5r3e
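For orientation, the dense bound quoted from Hong and Kung follows from the standard tiling argument (constants suppressed); this is textbook background, not a calculation specific to the paper:

    (U/√M)^3 tile-product tasks × O(M/B) I/Os per task = O(U^3 / (B√M)) I/Os in total,

where each task multiplies two tiles of side roughly √M that fit in internal memory, and the matching lower bound is the Hong-Kung result the abstract refers to.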

The Input/Output Complexity of Sparse Matrix Multiplication [article]

Rasmus Pagh, Morten Stöckel
2014 arXiv   pre-print
In the classical paper of Hong and Kung (STOC '81) it was shown that to compute a product of dense U × U matrices, Θ(U^3 / (B√M)) I/Os are necessary and sufficient in the I/O model with internal memory  ...  While our lower bound uses fairly standard techniques, the upper bound makes use of "compressed matrix multiplication" sketches, which is new in the context of I/O-efficient algorithms, and a new matrix  ...  Since Z/2 outputs are needed either in the direct or the indirect way, the number of I/Os needed becomes the minimum of the two lower bounds; we get Theorem 2.  ...
arXiv:1403.3551v1 fatcat:ek3lgvnx75hyjoujqhqo5inzda

Randomized Primitives for Big Data Processing

Morten Stöckel
2017 Künstliche Intelligenz  
Our results in the I/O model can be summarized as: In the RAM model we seek to use fast matrix multiplication, i.e., the O(U^ω) method, to compute fast matrix products.  ...  In the I/O model we settle the I/O complexity of sparse matrix multiplication: We provide a new algorithm and a matching lower bound, both parameterized on input sparsity N and output sparsity Z.  ...
doi:10.1007/s13218-017-0515-7 fatcat:3ffl5jvmq5ayte2tzmpir7jz4e

Foreword

Cyril Gavoille, Boaz Patt-Shamir, Christian Scheideler
2010 Theory of Computing Systems  
on the distributed complexity and local approximability of the Capacitated Dominating Set Problem. • "Strong-Diameter Decompositions of Minor Free Graphs" provides the first sparse covers and probabilistic  ...  for Multiprocessor Scheduling under Uncertainty" presents polynomial-time approximation algorithms for the multiprocessor scheduling problem in scenarios where there is uncertainty in the successful execution  ...  Multiplication in the I/O-Model" investigates the worst-case complexity, in terms of the number of I/O operations, of multiplying a matrix with a vector when both are stored in external memory.  ...
doi:10.1007/s00224-010-9284-5 fatcat:twbodq7n7ngkvmxxppywh2pmny

Fine-Grained I/O Complexity via Reductions: New lower bounds, faster algorithms, and a time hierarchy [article]

Erik D. Demaine, Andrea Lincoln, Quanquan C. Liu, Jayson Lynch, Virginia Vassilevska Williams
2017 arXiv   pre-print
We generate new I/O assumptions based on the difficulty of improving sparse graph problem running times in the I/O model.  ...  Finally, we prove an analog of the Time Hierarchy Theorem in the I/O model.  ...  Acknowledgements. We thank the anonymous reviewers for their helpful suggestions.  ...
arXiv:1711.07960v3 fatcat:i7sgttq64rfclpnoibguqiqduy

An out-of-core implementation of the COLUMBUS massively-parallel multireference configuration interaction program

H. Dachsel, J. Nieplocha, R. Harrison
1998 Proceedings of the IEEE/ACM SC98 Conference  
From the mathematical perspective, the program solves the eigenvalue problem for a very large, sparse, symmetric Hamiltonian matrix.  ...  techniques on vector supercomputers.  ...  To support the noncollective I/O model of Shared Files we developed the Distant I/O model [11].  ...
doi:10.1109/sc.1998.10027 dblp:conf/sc/DachselNH98 fatcat:pzp6v2riwze6tppwbjxhasi3vq

The I/O Complexity of Sparse Matrix Dense Matrix Multiplication [chapter]

Gero Greiner, Riko Jacob
2010 Lecture Notes in Computer Science  
We consider the multiplication of a sparse N × N matrix A with a dense N × N matrix B in the I/O model.  ...  sparse matrix) that use only a constant factor more I/Os.  ...  In this paper, we determine the complexity of multiplying two N × N matrices, one of which is k-sparse, in the semiring I/O-model with tall cache M ≥ B^2 to be Θ(max{kN^2/(B∆), kN^2/(B√M), N^2/B, 1}), where ∆  ...
doi:10.1007/978-3-642-12200-2_14 fatcat:ihi6o55ie5ggbjli4rqlpxvy2y
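Two of the four terms in the reconstructed bound above have familiar readings, offered here only as orientation: N^2/B is the cost of streaming the dense N × N operand and the output through blocks of size B, and kN^2/(B√M) is the Hong-Kung-style dense bound with the U^3 elementary products replaced by the kN^2 products a k-sparse operand induces. The remaining term depends on the parameter ∆, whose definition is elided in the snippet.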

Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format [article]

Shaohuai Shi, Qiang Wang, Xiaowen Chu
2020 arXiv   pre-print
Multiplication of a sparse matrix by a dense matrix (SpDM) is widely used in many areas like scientific computing and machine learning.  ...  The storage data structures keep sparse matrices in a memory-saving format, but they make it difficult to optimize the performance of SpDM on modern GPUs due to the irregular data access of the sparse  ...  Regarding the SpDM algorithm analysis, Greiner et al. [36] use an I/O model to characterize the lower bound for efficient serial algorithms.  ...
arXiv:2005.14469v1 fatcat:42n64i7olvh33lcpgx2z6j4rvu
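To reproduce a plain SpDM baseline (independent of the customized storage format the paper proposes, and on CPU rather than GPU), standard SciPy calls suffice; the sizes below are arbitrary illustrative values:

    import numpy as np
    from scipy import sparse

    # Baseline sparse-dense multiplication (SpDM): C = A @ B with sparse A, dense B.
    A = sparse.random(1024, 1024, density=0.01, format="csr", random_state=0)
    B = np.random.default_rng(0).standard_normal((1024, 256))
    C = A @ B            # dispatches to SciPy's CSR-times-dense kernel
    print(C.shape)       # (1024, 256)

Such a baseline is useful mainly as a reference result for checking the correctness of a custom GPU kernel.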

SLACID - sparse linear algebra in a column-oriented in-memory database system

David Kernert, Frank Köhler, Wolfgang Lehner
2014 Proceedings of the 26th International Conference on Scientific and Statistical Database Management - SSDBM '14  
In this paper, we present a sparse matrix chain optimizer (SpMachO) that creates an execution plan composed of multiplication operators and transformations between sparse and dense matrix storage  ...  We introduce a comprehensive cost model for sparse, dense, and hybrid multiplication kernels.  ...  Hence, mainly the memory accesses contribute to the runtime of an algorithm, which is in conformance with the external memory (I/O) model.  ...
doi:10.1145/2618243.2618254 dblp:conf/ssdbm/KernertKL14 fatcat:nwchplfmhvhpli5jhlc656utxy

ICE: A General and Validated Energy Complexity Model for Multithreaded Algorithms [article]

Vi Ngoc-Nha Tran, Phuong Hoai Ha
2016 arXiv   pre-print
The new model is validated with different sparse matrix-vector multiplication (SpMV) algorithms and dense matrix multiplication (matmul) algorithms running on high performance computing (HPC) platforms (  ...  In order to improve the usability and accuracy of the new model for a wide range of platforms, the platform parameters of the ICE model are provided for eleven platforms including HPC, accelerator and embedded  ...  The I/O complexity of CSC in the sequential I/O model of column-major layout is O(nz) [5]. Similar to CSR, scanning all non-zero elements of matrix A in CSC format costs O(nz/B) I/Os.  ...
arXiv:1605.08222v2 fatcat:7nq6oxv2p5gjjoyfanp5pw4qsm
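The O(nz/B) scanning cost quoted in the snippet comes from reading the nz nonzeros sequentially in blocks of B; the CSR traversal below is a generic sketch of that access pattern, not code from the ICE model:

    # CSR arrays: row_ptr has N+1 entries; col_idx and vals each hold the nz nonzeros.
    def csr_spmv(row_ptr, col_idx, vals, x):
        N = len(row_ptr) - 1
        y = [0.0] * N
        for i in range(N):
            for k in range(row_ptr[i], row_ptr[i + 1]):   # sequential scan over the nonzeros
                y[i] += vals[k] * x[col_idx[k]]           # random access into x
        return y

Scanning row_ptr, col_idx and vals touches O(nz/B) blocks; it is the scattered accesses into x (or into y for CSC) that I/O-model analyses of SpMV have to charge for.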

A Memory Model for Scientific Algorithms on Graphics Processors

Naga Govindaraju, Scott Larsen, Jim Gray, Dinesh Manocha
2006 ACM/IEEE SC 2006 Conference (SC'06)  
In order to demonstrate the effectiveness of our model, we highlight its performance on three memory-intensive scientific applications -sorting, fast Fourier transform and dense matrix-multiplication.  ...  In practice, we are able to achieve 2-5× performance improvement.  ...  Acknowledgements This work is supported in part by ARO Contracts DAAD19-02-1-0390 and W911NF-04-1-0088, NSF awards 0400134 and 0118743, DARPA/RDECOM Contract N61339-04-C-0043, ONR Contract N00014-01-1-  ... 
doi:10.1109/sc.2006.2 fatcat:khuppw4ib5bltfweweg6dhx22i
Showing results 1–15 of 58.