Y-MP FLOATING POINT AND CHOLESKY FACTORIZATION

RUSSELL CARTER
1991 International journal of high speed computing  
The size of the residual can be related to the size of the ... of A and ...  ...  r = b - Ax (compute residual); solve GG^T e = r for e (using the previously computed factors); x = x + e  ... 
doi:10.1142/s0129053391000103 fatcat:v3b7guqr2vhhrmhfn6ymhui64u

SOLVING LARGE SCALE LINEAR PROGRAMMING PROBLEMS USING AN INTERIOR POINT METHOD ON A MASSIVELY PARALLEL SIMD COMPUTER

HJÁLMTÝR HAFSTEINSSON, RONI LEVKOVITZ, GAUTAM MITRA
1994 Parallel Algorithms and Applications  
At the heart of the implementation is a parallel Cholesky factorization algorithm for sparse matrices.  ...  Our implementation uses a new scheme of mapping the matrix onto the processor grid of the MasPar, that results in a more efficient Cholesky factorization than previously suggested schemes.  ...  Jones and Mr. C. Tong of the MRC Cyclotron Unit of Hammersmith Hospital for supplying the PET models and for working closely with us in the solution of those problems.  ... 
doi:10.1080/10637199408915470 fatcat:3opyqbllebenrbqwltncynnita

A Supernodal Cholesky Factorization Algorithm for Shared-Memory Multiprocessors

Esmond Ng, Barry W. Peyton
1993 SIAM Journal on Scientific Computing  
A parallel supernodal Cholesky factorization algorithm will be presented in Section 3 as well. Section 4 provides experimental results on an IBM RS/6000, a Cray Y-MP, and a Sequent Balance 8000.  ...  Table 4.2.  ...  A single floating-point operation is either a floating-point addition or a floating-point multiplication, and is denoted by "flop".  ... 
doi:10.1137/0914048 fatcat:neyhsiudkngmfpkp4gfl3c5sau

Block Sparse Cholesky Algorithms on Advanced Uniprocessor Computers

Esmond G. Ng, Barry W. Peyton
1993 SIAM Journal on Scientific Computing  
Acknowledgments The authors thank the Minnesota Supercomputer Institute and Cray Research, Inc. for providing computer support.  ...  Also, part of the work was performed while the authors were visiting the Institute for Mathematics and its Applications at the University of Minnesota, which is funded principally by the National Science  ...  All algorithms were coded in Fortran and all floating-point operations were performed in double precision, except on the Cray Y-MP.  ... 
doi:10.1137/0914063 fatcat:e3zz4osnljeiloyfzifveeqe2u

Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices Using GPUs [chapter]

Ahmad Abdelfattah, Stan Tomov, Jack Dongarra
2020 Lecture Notes in Computer Science  
The solver is based on a mixed-precision Cholesky factorization that utilizes the high-performance tensor core units in CUDA-enabled GPUs.  ...  Since the Cholesky factors are affected by the low precision, an iterative refinement (IR) solver is required to recover the solution back to double-precision accuracy.  ...  Throughout the paper, we assume that b is an N × 1 vector, and so the triangular solve step requires O(N^2) floating-point operations (FLOPs).  ... 
doi:10.1007/978-3-030-50417-5_18 fatcat:mcx73t5pmzhb3n6rcccqoem5ja

A parallel-vector algorithm for rapid structural analysis on high-performance computers

T.K. Agarwal, O.O. Storaasli, D.T. Nguyen
1994 Computers & structures  
This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization.  ...  The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined.  ...  A record is kept of number of floating point operations performed by each processor to factor and solve the matrix (totf, tots) as well as the elapsed (et0-et5) and task CPU time (t0-t5) on each processor  ... 
doi:10.1016/0045-7949(94)90057-4 fatcat:mz26tgvznjbr3fmtfom5v6orly

Solution of large, dense symmetric generalized eigenvalue problems using secondary storage

Roger G. Grimes, Horst D. Simon
1988 ACM Transactions on Mathematical Software  
A combination of block Cholesky and block Householder transformations are used to reduce the problem to a symmetric banded eigenproblem whose eigenvalues can be computed in central memory.  ...  Here A is assumed to be symmetric, and B symmetric positive definite.  ...  ACKNOWLEDGMENT We would like to thank Dan Pierce for modifying TREEQL and for carrying out the related numerical experiments.  ... 
doi:10.1145/44128.44130 fatcat:x3mmxvycpjgejf3rvdmqirdtpm

Squeezing the most out of an algorithm in CRAY FORTRAN

Jack J. Dongarra, Stanley C. Eisenstat
1984 ACM Transactions on Mathematical Software  
The technique can be applied to a wide variety of algorithms in linear algebra, and is beneficial in other architectural settings. MFLOPS is an acronym for million floating-point operations (additions or  ...  ACKNOWLEDGMENTS We would like to thank the National Magnetic Fusion Energy Computer Center for providing computer time to carry out some of the experiments, Cray Research for their cooperation, and Alan  ...  CFT has six vector memory references for each eight vector floating-point operations.  ... 
doi:10.1145/1271.319413 fatcat:nhce2bm655fbbpmzr6fdqykhxu

Early Experience With the Intel iPSC/860 At Oak Ridge National Laboratory

Michael T. Heath, George A. Geist, John B. Drake
1991 The International Journal of Supercomputing Applications  
and data  ...  performance, and 60 Mflops 64-bit floating-point performance.  ...  Machine, Processors, Mflops, Comments: DEC 3100, 1, 2; IBM RS/6000, 1, 18, Model 530; Cray-2, 1, 49; Cray Y-MP, 1, 130, Fortran only; Cray Y-MP, 1, 203, assembler BLAS; Cray Y-MP, 8, 1509, multitasking  ... 
doi:10.1177/109434209100500202 fatcat:als6nyzugrfzrhgrrgujkqfjhq

Explicit parallel block Cholesky algorithms on the CRAY APP

Margreet Nool
1995 Applied Numerical Mathematics  
Furthermore, two different algorithms for Cholesky factorization are discussed: a block left-looking algorithm and a block right-looking algorithm.  ...  In this paper we consider the CRAY APP, the Attached Parallel Processor of the CRAY S-MP, which consists of seven buses with each bus supporting up to 12 processing elements.  ...  Acknowledgements The author wishes to thank Alan Stewart and Herman J.J. te Riele for their constructive comments.  ... 
doi:10.1016/0168-9274(95)00078-9 fatcat:nvvbjc4emrfvxo2y5y3xp7yb2y

The AT&T Korbx® System

Yun-Chian Cheng, David J. Houck, Jun-Min Liu, Marc S. Meketon, Lev Slutsman, Robert J. Vanderbei, Pyng Wang
1989 AT&T Technical Journal  
Murray, Adrian Kester, and Srinivas Sataluri for developing and enhancing the preprocessor and postprocessor; and Karen Medhi for improvements to the dual conjugate-gradient method.  ...  We would also like to thank Narendra Karmarkar and Ram Ramakrishnan for developing prototype code for both the primal affine and dual conjugate-gradient methods and for other help they have given us.  ...  (c) Cholesky factor L with original ordering; shows the nonzero structure below the diagonal of the Cholesky factor.  ... 
doi:10.1002/j.1538-7305.1989.tb00315.x fatcat:b3dnnxa6tnc4tmsiey2swfszgi

The Matlab Radial Basis Function Toolbox

Scott A. Sarra
2017 Journal of Open Research Software  
The Matlab Radial Basis Function toolbox features a regularization method for the ill-conditioned system, extended precision floating point arithmetic, and symmetry exploitation for the purpose of reducing  ...  Radial Basis Function (RBF) methods are important tools for scattered data interpolation and for the solution of Partial Differential Equations in complexly shaped domains.  ...  If a matrix is symmetric, Cholesky factorization is attempted. If Cholesky factorization fails, then the matrix is factorized with LU factorization.  ... 
doi:10.5334/jors.131 fatcat:n3ncl7nucreprgdu2yulwatp4a

Geostatistical Modeling and Prediction Using Mixed Precision Tile Cholesky Factorization

Sameh Abdulah, Hatem Ltaief, Ying Sun, Marc G. Genton, David E. Keyes
2019 2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC)  
Cholesky, the standard algorithm, requires O(n^3) floating point operators and has an O(n^2) memory footprint, where n is the number of geographical locations.  ...  Here, we present a mixed-precision tile algorithm to accelerate the Cholesky factorization during the log-likelihood function evaluation.  ...  ., and Intel Corp., the Cray Center of Excellence and Intel Parallel Computing Center awarded to the Extreme Computing Research Center (ECRC) at KAUST.  ... 
doi:10.1109/hipc.2019.00028 dblp:conf/hipc/AbdulahLSGK19 fatcat:6ifuuqt7nfgslkkth3bljoud6m

Geostatistical Modeling and Prediction Using Mixed-Precision Tile Cholesky Factorization [article]

Sameh Abdulah, Hatem Ltaief, Ying Sun, Marc G. Genton, David E. Keyes
2020 arXiv   pre-print
Cholesky, the standard algorithm, requires O(n^3) floating point operators and has an O(n^2) memory footprint, where n is the number of geographical locations.  ...  Here, we present a mixed-precision tile algorithm to accelerate the Cholesky factorization during the log-likelihood function evaluation.  ...  ., and Intel Corp., the Cray Center of Excellence and Intel Parallel Computing Center awarded to the Extreme Computing Research Center (ECRC) at KAUST.  ... 
arXiv:2003.05324v1 fatcat:f2qqlfv3evdx7jfmtj2yn3difm

Multigrid solution of the Poisson-Boltzmann equation

Michael Holst, Faisal Saied
1993 Journal of Computational Chemistry  
Our results indicate that the multigrid method is superior to the preconditioned CG methods and SOR, and that the advantage of multigrid grows with the problem size.  ...  A detailed analysis of the resulting method is presented for several computer architectures, including comparisons to diagonally scaled CG, ICCG, vectorized ICCG and MICCG, and to SOR provided with an  ...  Timings and Megaflop Rates Timings, operation counts, and megaflops (one million floating point operations per second) figures on the Cray Y-MP were obtained from the performance monitoring hardware accessed  ... 
doi:10.1002/jcc.540140114 fatcat:kq2jp4dg45fzrj2ndbwiefhqpu