Y-MP FLOATING POINT AND CHOLESKY FACTORIZATION
1991
International journal of high speed computing
The size of the residual can be related to the size of the [...] of A and [...] ...
r = b - Ax (compute residual); solve GG^T e = r for e (using the previously computed factors); x = x + e ...
doi:10.1142/s0129053391000103
fatcat:v3b7guqr2vhhrmhfn6ymhui64u
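The refinement steps quoted in this snippet (compute the residual, solve for a correction with the previously computed Cholesky factors, add the correction to the solution) can be sketched as follows. This is a minimal pure-Python illustration, not the paper's Cray Y-MP code: the function names, the small test matrix, and the fixed number of refinement steps are all assumptions made for the example.

```python
import math

def cholesky(A):
    """Return lower-triangular G with A = G G^T (A symmetric positive definite)."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j] - sum(G[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        G[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(G[i][k] * G[j][k] for k in range(j))
            G[i][j] = (A[i][j] - s) / G[j][j]
    return G

def solve_with_factor(G, b):
    """Solve G G^T x = b by forward, then backward, substitution."""
    n = len(G)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(G[i][k] * y[k] for k in range(i))) / G[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(G[k][i] * x[k] for k in range(i + 1, n))) / G[i][i]
    return x

def residual(A, x, b):
    """Return r = b - A x."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(A, b)]

def refine(A, b, steps=2):
    """Solve A x = b, then apply a fixed number of refinement steps."""
    G = cholesky(A)                    # factor once
    x = solve_with_factor(G, b)        # initial solution
    for _ in range(steps):
        r = residual(A, x, b)          # r = b - A x
        e = solve_with_factor(G, r)    # reuse the factors: G G^T e = r
        x = [xi + ei for xi, ei in zip(x, e)]
    return x

# Hypothetical test problem: b is A times the all-ones vector.
A = [[4.0, 1.0, 2.0], [1.0, 3.0, 0.5], [2.0, 0.5, 5.0]]
b = [7.0, 4.5, 7.5]
x = refine(A, b)
```

The factorization is computed once and reused for every correction, so each refinement step costs only a matrix-vector product and two triangular solves.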
SOLVING LARGE SCALE LINEAR PROGRAMMING PROBLEMS USING AN INTERIOR POINT METHOD ON A MASSIVELY PARALLEL SIMD COMPUTER
1994
Parallel Algorithms and Applications
At the heart of the implementation is a parallel Cholesky factorization algorithm for sparse matrices. ...
Our implementation uses a new scheme of mapping the matrix onto the processor grid of the MasPar, that results in a more efficient Cholesky factorization than previously suggested schemes. ...
Jones and Mr. C. Tong of the MRC Cyclotron Unit of Hammersmith Hospital for supplying the PET models and for working closely with us in the solution of those problems. ...
doi:10.1080/10637199408915470
fatcat:3opyqbllebenrbqwltncynnita
A Supernodal Cholesky Factorization Algorithm for Shared-Memory Multiprocessors
1993
SIAM Journal on Scientific Computing
A parallel supernodal Cholesky factorization algorithm will be presented in Section 3 as well. Section 4 provides experimental results on an IBM RS/6000, a Cray Y-MP, and a Sequent Balance 8000. ...
Table 4.2. ... A single floating-point operation is either a floating-point addition or a floating-point multiplication, and is denoted by "flop". ...
doi:10.1137/0914048
fatcat:neyhsiudkngmfpkp4gfl3c5sau
Block Sparse Cholesky Algorithms on Advanced Uniprocessor Computers
1993
SIAM Journal on Scientific Computing
Acknowledgments The authors thank the Minnesota Supercomputer Institute and Cray Research, Inc. for providing computer support. ...
Also, part of the work was performed while the authors were visiting the Institute for Mathematics and its Applications at the University of Minnesota, which is funded principally by the National Science ...
All algorithms were coded in Fortran and all floating-point operations were performed in double precision, except on the Cray Y-MP. ...
doi:10.1137/0914063
fatcat:e3zz4osnljeiloyfzifveeqe2u
Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices Using GPUs
[chapter]
2020
Lecture Notes in Computer Science
The solver is based on a mixed-precision Cholesky factorization that utilizes the high-performance tensor core units in CUDA-enabled GPUs. ...
Since the Cholesky factors are affected by the low precision, an iterative refinement (IR) solver is required to recover the solution back to double-precision accuracy. ...
Throughout the paper, we assume that b is an N × 1 vector, and so the triangular solve step requires O(N^2) floating-point operations (FLOPs). ...
doi:10.1007/978-3-030-50417-5_18
fatcat:mcx73t5pmzhb3n6rcccqoem5ja
A parallel-vector algorithm for rapid structural analysis on high-performance computers
1994
Computers & structures
This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization. ...
The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined. ...
A record is kept of number of floating point operations performed by each processor to factor and solve the matrix (totf, tots) as well as the elapsed (et0-et5) and task CPU time (t0-t5) on each processor ...
doi:10.1016/0045-7949(94)90057-4
fatcat:mz26tgvznjbr3fmtfom5v6orly
Solution of large, dense symmetric generalized eigenvalue problems using secondary storage
1988
ACM Transactions on Mathematical Software
A combination of block Cholesky and block Householder transformations are used to reduce the problem to a symmetric banded eigenproblem whose eigenvalues can be computed in central memory. ...
Here A is assumed to be symmetric, and B symmetric positive definite. ...
ACKNOWLEDGMENT We would like to thank Dan Pierce for modifying TREEQL and for carrying out the related numerical experiments. ...
doi:10.1145/44128.44130
fatcat:x3mmxvycpjgejf3rvdmqirdtpm
Squeezing the most out of an algorithm in CRAY FORTRAN
1984
ACM Transactions on Mathematical Software
The technique can be applied to a wide variety of algorithms in linear algebra, and is beneficial in other architectural settings. MFLOPS is an acronym for million floating-point operations (additions or ...
ACKNOWLEDGMENTS We would like to thank the National Magnetic Fusion Energy Computer Center for providing computer time to carry out some of the experiments, Cray Research for their cooperation, and Alan ...
CFT has six vector memory references for each eight vector floating-point operations. ...
doi:10.1145/1271.319413
fatcat:nhce2bm655fbbpmzr6fdqykhxu
Early Experience With the Intel iPSC/860 At Oak Ridge National Laboratory
1991
The International Journal of Supercomputing Applications
and data ...
... performance, and 60 Mflops 64-bit floating-point performance. ...
Machine       Processors   Mflops   Comments
DEC 3100      1            2
IBM RS/6000   1            18       Model 530
Cray-2        1            49
Cray Y-MP     1            130      Fortran only
Cray Y-MP     1            203      assembler BLAS
Cray Y-MP     8            1509     multitasking ...
doi:10.1177/109434209100500202
fatcat:als6nyzugrfzrhgrrgujkqfjhq
Explicit parallel block Cholesky algorithms on the CRAY APP
1995
Applied Numerical Mathematics
Furthermore, two different algorithms for Cholesky factorization are discussed: a block left-looking algorithm and a block right-looking algorithm. ...
In this paper we consider the CRAY APP, the Attached Parallel Processor of the CRAY S-MP, which consists of seven buses with each bus supporting up to 12 processing elements. ...
Acknowledgements The author wishes to thank Alan Stewart and Herman J.J. te Riele for their constructive comments. ...
doi:10.1016/0168-9274(95)00078-9
fatcat:nvvbjc4emrfvxo2y5y3xp7yb2y
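The difference between the left-looking and right-looking orderings mentioned in this snippet can be illustrated with a small sketch. The paper discusses block algorithms on the CRAY APP; the code below uses unblocked, column-at-a-time variants in pure Python only to show where the updates happen. The function names and the test matrix are assumptions made for the example.

```python
import math

def lower_copy(A):
    """Copy the lower triangle of A into a fresh matrix (upper part zeroed)."""
    n = len(A)
    return [[A[i][j] if i >= j else 0.0 for j in range(n)] for i in range(n)]

def cholesky_left_looking(A):
    """Column j pulls in updates from all finished columns k < j, then is scaled."""
    n = len(A)
    L = lower_copy(A)
    for j in range(n):
        for k in range(j):                 # look left: apply earlier columns
            for i in range(j, n):
                L[i][j] -= L[i][k] * L[j][k]
        L[j][j] = math.sqrt(L[j][j])       # scale column j
        for i in range(j + 1, n):
            L[i][j] /= L[j][j]
    return L

def cholesky_right_looking(A):
    """Column k is scaled, then immediately pushes updates to columns j > k."""
    n = len(A)
    L = lower_copy(A)
    for k in range(n):
        L[k][k] = math.sqrt(L[k][k])       # scale column k
        for i in range(k + 1, n):
            L[i][k] /= L[k][k]
        for j in range(k + 1, n):          # look right: update trailing columns
            for i in range(j, n):
                L[i][j] -= L[i][k] * L[j][k]
    return L

# Hypothetical symmetric positive definite test matrix.
A = [[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
L_left = cholesky_left_looking(A)
L_right = cholesky_right_looking(A)
```

Both orderings perform the same arithmetic and return the same factor; they differ in when each column is read and written, which is what matters for data movement on a parallel machine.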
The AT&T Korbx® System
1989
AT&T Technical Journal
Murray, Adrian Kester, and Srinivas Sataluri for developing and enhancing the preprocessor and postprocessor; and Karen Medhi for improvements to the dual conjugate-gradient method. ...
We would also like to thank Narendra Karmarkar and Ram Ramakrishnan for developing prototype code for both the primal affine and dual conjugate-gradient methods and for other help they have given us. ...
(c) Cholesky factor L with original ordering; shows the nonzero structure below the diagonal of the Cholesky factor. ...
doi:10.1002/j.1538-7305.1989.tb00315.x
fatcat:b3dnnxa6tnc4tmsiey2swfszgi
The Matlab Radial Basis Function Toolbox
2017
Journal of Open Research Software
The Matlab Radial Basis Function toolbox features a regularization method for the ill-conditioned system, extended precision floating point arithmetic, and symmetry exploitation for the purpose of reducing ...
Radial Basis Function (RBF) methods are important tools for scattered data interpolation and for the solution of Partial Differential Equations in complexly shaped domains. ...
If a matrix is symmetric, Cholesky factorization is attempted. If Cholesky factorization fails, then the matrix is factorized with LU factorization. ...
doi:10.5334/jors.131
fatcat:n3ncl7nucreprgdu2yulwatp4a
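The fallback strategy described in this snippet (try Cholesky on a symmetric matrix, switch to LU if it fails) can be sketched as below. This is a pure-Python illustration, not the toolbox's MATLAB code: the function names, the symmetry tolerance, and the choice of partial pivoting for the LU step are assumptions made for the example.

```python
import math

def is_symmetric(A, tol=1e-12):
    """Check A[i][j] == A[j][i] up to an assumed absolute tolerance."""
    n = len(A)
    return all(abs(A[i][j] - A[j][i]) <= tol for i in range(n) for j in range(i))

def cholesky(A):
    """Return lower-triangular G with A = G G^T, or raise ValueError."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j] - sum(G[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        G[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(G[i][k] * G[j][k] for k in range(j))
            G[i][j] = (A[i][j] - s) / G[j][j]
    return G

def lu_partial_pivot(A):
    """Return (perm, LU): row order and packed unit-lower L / upper U factors."""
    n = len(A)
    LU = [row[:] for row in A]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))  # pivot row
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular")
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]                           # multiplier in L
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return perm, LU

def factorize(A):
    """Attempt Cholesky on symmetric input; fall back to LU otherwise."""
    if is_symmetric(A):
        try:
            return "cholesky", cholesky(A)
        except ValueError:
            pass  # symmetric but not positive definite
    return "lu", lu_partial_pivot(A)

kind_spd, _ = factorize([[4.0, 1.0], [1.0, 3.0]])      # positive definite
kind_indef, _ = factorize([[1.0, 2.0], [2.0, 1.0]])    # symmetric, indefinite
```

A failed Cholesky attempt doubles as a positive-definiteness test, so the first try costs little compared with checking definiteness separately.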
Geostatistical Modeling and Prediction Using Mixed Precision Tile Cholesky Factorization
2019
2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC)
Cholesky, the standard algorithm, requires O(n^3) floating point operators and has an O(n^2) memory footprint, where n is the number of geographical locations. ...
Here, we present a mixed-precision tile algorithm to accelerate the Cholesky factorization during the log-likelihood function evaluation. ...
., and Intel Corp., the Cray Center of Excellence and Intel Parallel Computing Center awarded to the Extreme Computing Research Center (ECRC) at KAUST. ...
doi:10.1109/hipc.2019.00028
dblp:conf/hipc/AbdulahLSGK19
fatcat:6ifuuqt7nfgslkkth3bljoud6m
Geostatistical Modeling and Prediction Using Mixed-Precision Tile Cholesky Factorization
[article]
2020
arXiv
pre-print
Cholesky, the standard algorithm, requires O(n^3) floating point operators and has an O(n^2) memory footprint, where n is the number of geographical locations. ...
Here, we present a mixed-precision tile algorithm to accelerate the Cholesky factorization during the log-likelihood function evaluation. ...
., and Intel Corp., the Cray Center of Excellence and Intel Parallel Computing Center awarded to the Extreme Computing Research Center (ECRC) at KAUST. ...
arXiv:2003.05324v1
fatcat:f2qqlfv3evdx7jfmtj2yn3difm
Multigrid solution of the Poisson-Boltzmann equation
1993
Journal of Computational Chemistry
Our results indicate that the multigrid method is superior to the preconditioned CG methods and SOR, and that the advantage of multigrid grows with the problem size. ...
A detailed analysis of the resulting method is presented for several computer architectures, including comparisons to diagonally scaled CG, ICCG, vectorized ICCG and MICCG, and to SOR provided with an ...
Timings and Megaflop Rates. Timings, operation counts, and megaflops (one million floating point operations per second) figures on the Cray Y-MP were obtained from the performance monitoring hardware accessed ...
doi:10.1002/jcc.540140114
fatcat:kq2jp4dg45fzrj2ndbwiefhqpu