### Solving Tall Dense Linear Programs in Nearly Linear Time
[article]

2021, *arXiv*, pre-print

Interestingly, we obtain this running *time* without using fast matrix multiplication and consequently, barring a major advance *in* *linear* system *solving*, our running *time* is near optimal for *solving* *dense* ... *In* this paper we provide an Õ(nd+d^3) *time* randomized algorithm for *solving* *linear* *programs* with d variables and n constraints with high probability. ... Acknowledgements We thank Sébastien Bubeck, Ofer Dekel, Jerry Li, Ilya Razenshteyn, and Microsoft Research for facilitating conversations between and hosting researchers involved *in* this collaboration. ...

arXiv:2002.02304v2
fatcat:palnpmhb2rdsrp6yptsshobbpu

### Mapping Dense LU Factorization on Multicore Supercomputer Nodes

2012, *2012 IEEE 26th International Parallel and Distributed Processing Symposium*

*Dense* LU factorization is a prominent benchmark used to rank the performance of supercomputers. ... A block-cyclic mapping *in* the row-major order does not encounter this problem, but consequently sacrifices node and network locality *in* the critical pivoting steps. ... Running *time* on Jaguar, a resource of the National Center for Computational Sciences at Oak Ridge National Laboratory, was supported by DOE contract DE-AC05-00OR22725. ...

### Tall and skinny QR factorizations in MapReduce architectures

2011, *Proceedings of the second international workshop on MapReduce and its applications - MapReduce '11*

We present an implementation of the *tall* and skinny QR (TSQR) factorization *in* the Map-Reduce framework, and we provide computational results for *nearly* terabyte-sized datasets. ... These tasks run *in* just a few minutes under a variety of parameter choices. ... We would also like to thank James Demmel for suggesting examining the reference streaming *time*. ...

doi:10.1145/1996092.1996103
fatcat:4ykdtyl6ezeppgz6z3mhkmtwzy

### Kokkos Kernels: Performance Portable Sparse/Dense Linear Algebra and Graph Kernels
[article]

2021, *arXiv*, pre-print

We describe Kokkos Kernels, a library of kernels for sparse *linear* algebra, *dense* *linear* algebra and graph kernels. ... As hardware architectures are evolving *in* the push towards exascale, developing Computational Science and Engineering (CSE) applications depend on performance portable approaches for sustainable software ... However, no portable solution existed for sparse/*dense* *linear* algebra kernels before the Kokkos Kernels library was created as part of the Advanced Technology Development and Mitigation *program* and the ...

arXiv:2103.11991v1
fatcat:m7iskgt5kjdjnex7lenqsjj6z4

### Analysis of a Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards

2017, *SIAM Journal on Scientific Computing*

We discuss an approach for *solving* sparse or *dense* banded *linear* systems Ax = b on a graphics processing unit (GPU) card. ... *In* a comparison against Intel's MKL, SaP::GPU also fared well when used to *solve* *dense* banded systems that are close to being diagonally dominant. ... Note that K selection is relevant only *in* the context of *solving* sparse *linear* systems; for *dense* banded, K is a given. ...

doi:10.1137/15m1039523
fatcat:oblxmqbzpbcvjc4mbslummq6vi

### Row Modifications of a Sparse Cholesky Factorization

2005, *SIAM Journal on Matrix Analysis and Applications*

Additional illustrations can be found *in* [12]. A variety of techniques for modifying a *dense* Cholesky factorization are given *in* the classic reference [11]. ... We also determine how the solution of a *linear* system Lx = b changes after changing a row and column of C or after a rank-r change *in* C. ... where e_i is the ith column of the identity matrix. ... *In* each iteration of the *linear* *programming* dual active set algorithm (LPDASA) (see [5, 13, 14, 15, 16, 17]), we *solve* a symmetric *linear* system of the form Cλ = f, C = A_F A_F^T + σI, where σ > 0 is ...

doi:10.1137/s089547980343641x
fatcat:64pphvso4vctvkptvofyoajgba

### Accelerating an Iterative Eigensolver for Nuclear Structure Configuration Interaction Calculations on GPUs using OpenACC
[article]

2021, *arXiv*, pre-print

step and *dense* *linear* algebra operations. ... (GPUs), we modified a previously developed hybrid MPI/OpenMP implementation of an eigensolver written *in* FORTRAN 90 by using an OpenACC directives based *programming* model. ... This work was supported *in* part by the U. ...

arXiv:2109.00485v1
fatcat:4f243aknszg43ljupfwxrqfxnu

### Communication-Avoiding QR Decomposition for GPUs

2011, *2011 IEEE International Parallel & Distributed Processing Symposium*

As a result, we outperform CULA, a parallel *linear* algebra library for GPUs, by up to 13x for *tall*-skinny matrices. ... We show that the reduction *in* memory traffic provided by CAQR allows us to outperform existing parallel GPU implementations of QR for a large class of *tall*-skinny matrices. ... The most common example is *linear* least squares, which is ubiquitous *in* *nearly* all branches of science and engineering and can be *solved* using QR. ...

doi:10.1109/ipdps.2011.15
dblp:conf/ipps/AndersonBDK11
fatcat:if32u2vn7natnmmnskpftflnua

### Gravitational Instabilities in the Disks of Massive Protostars as an Explanation for Linear Distributions of Methanol Masers

2001, *Astrophysical Journal*

This is particularly true for methanol (CH₃OH), for which *linear* distributions of masers are found with disklike kinematics. ... instabilities leads to a complex of intersecting spiral shocks, clumps, and arclets within the disk and to significant *time*-dependent, nonaxisymmetric distortions of the disk surface. ... Note the *tall* ridges of material *in* the arms. ...

doi:10.1086/338738
fatcat:u2ixysyckjaopitjmibmothwju

### Large Scale Distributed Linear Algebra With Tensor Processing Units
[article]

2021, *arXiv*, pre-print

Via curated algorithms emphasizing large, single-core matrix multiplications, other tasks *in* *dense* *linear* algebra can similarly scale. ... can multiply two matrices with *linear* size N = 2^20 = 1 048 576 *in* about 2 minutes. ... We consider the case of A given as a *dense*, full-rank matrix, *in* which case (7) is typically *solved* *in* O(N^3) operations via an initial LU decomposition. ...

arXiv:2112.09017v1
fatcat:ahdbdepkq5ajjc7bcjay5dfaj4

### Block Iterative Methods and Recycling for Improved Scalability of Linear Solvers

2016, *SC16: International Conference for High Performance Computing, Networking, Storage and Analysis*

This work has been supported *in* part by ANR through project MEDIMAX, ANR-13-MONU-0012. The first author was partially funded by the French Association of Mechanics (AFM) for this work. ... *In* the original paper, each *linear* system is *solved* with a single right-hand side, i.e. p = 1. ... Non-variable *linear* systems. For some *time*-dependent PDEs, it is necessary to *solve* sequences of *linear* systems where the operator is the same throughout the sequence, and only the right-hand sides are ...

doi:10.1109/sc.2016.16
dblp:conf/sc/JolivetT16
fatcat:dhydaneaarcyfgjhvjwcmdofzu

### Efficient Methods for Out-of-Core Sparse Cholesky Factorization

1999, *SIAM Journal on Scientific Computing*

We find that straightforward implementations of all of them suffer from excessive disk I/O for large problems that arise *in* interior-point algorithms for *linear* *programming*. ... The effect is that extremely large sparse *linear* systems can be *solved* *in* reasonable *time* on very inexpensive systems. ...

doi:10.1137/s1064827597322975
fatcat:t5kpc3qcezhn3pf2e2m6jkgd5u

### Parallel distributed-memory simplex for large-scale stochastic LP problems

2013, *Computational optimization and applications*

We present a parallelization of the revised simplex method for large extensive forms of two-stage stochastic *linear* *programming* (LP) problems. ... It is built on novel analysis of the *linear* algebra for dual block-angular LP problems when *solved* by using the revised simplex method and a novel parallel scheme for applying product-form updates. ... Total *time* *in* PRICE per iteration is given on the right. ... the path for efficiently *solving* stochastic *programming* problems *in* these two contexts. ...

doi:10.1007/s10589-013-9542-y
fatcat:obkds5dx6fdbddx55ncxpr3lym

### Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software

2004, *SIAM Review*

*In* fact, the whole gamut of existing *dense* *linear* algebra factorization is beginning to be reexamined *in* view of the recursive paradigm. ... Novel recursive blocked algorithms offer new ways to compute factorizations such as Cholesky and QR and to *solve* matrix equations. ... Some of the main points are the following: • Recursion creates new algorithms for *linear* algebra software. • Recursion can be used to express *dense* *linear* algebra algorithms entirely *in* terms of level ...

### A high-performance parallel algorithm for nonnegative matrix factorization

2016, *Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming - PPoPP '16*

It maintains the data and factor matrices *in* memory (distributed across processors), uses MPI for interprocessor communication, and, *in* the *dense* case, provably minimizes communication costs (under mild ... Despite its popularity *in* the data mining community, there is a lack of efficient distributed algorithms to *solve* the problem for big data sets. ... We also thank NSF for the travel grant to present this work *in* the conference through the grant CCF-1552229. ...

doi:10.1145/2851141.2851152
dblp:conf/ppopp/KannanBP16
fatcat:udekzdd7ffgqhajv3apnbbxfmi


*Showing results 1 — 15 out of 4,320 results*