1,968 Hits in 3.6 sec

Improved load distribution in parallel sparse Cholesky factorization

Edward Rothberg, Robert Schreiber
1994 Supercomputing, Proceedings  
Compared to the customary column-oriented approaches, block-oriented, distributed-memory sparse Cholesky factorization benefits from an asymptotic reduction in interprocessor communication volume and an  ...  The result is a roughly 20% increase in realized parallel factorization performance, as demonstrated by performance results from an Intel Paragon™ system.  ...  Finally, Section 5 discusses the results. 2 Parallel Sparse Cholesky Factorization 2.1 Computation Structure The goal of the sparse Cholesky computation is to factor a sparse symmetric  ... 
doi:10.1145/602896.602897 fatcat:onxkoeq25jcxdovzqmxfxlsaui
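
The computation the snippet describes produces a lower-triangular factor L with A = L·Lᵀ. A minimal column-oriented sketch, in Python on a dense array for clarity; this is textbook material, not the paper's block-oriented, distributed-memory algorithm:

```python
import numpy as np

def cholesky_columns(A):
    """Left-looking (column-oriented) Cholesky: returns L with A = L @ L.T.

    Textbook sketch on a dense array; A must be symmetric positive
    definite, and only its lower triangle is referenced.
    """
    n = A.shape[0]
    L = np.tril(A).astype(float)
    for j in range(n):
        for k in range(j):                 # gather updates from prior columns
            L[j:, j] -= L[j, k] * L[j:, k]
        L[j, j] = np.sqrt(L[j, j])         # pivot
        L[j + 1:, j] /= L[j, j]            # finish column j
    return L

A = np.array([[4., 2., 2.], [2., 5., 3.], [2., 3., 6.]])
L = cholesky_columns(A)
assert np.allclose(L @ L.T, A)
```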

Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization

Heejo Lee, Jong Kim, Sung Je Hong, Sunggu Lee
2003 Parallel Computing  
In this paper, we represent tasks using a block dependency DAG that represents the execution behavior of block sparse Cholesky factorization in a distributed-memory system.  ...  Block-oriented sparse Cholesky factorization decomposes a sparse matrix into rectangular subblocks; each block can then be handled as a computational unit in order to increase data reuse in a hierarchical  ...  In addition, he provided us with his technical reports and the SPOOLES library, which is used for ordering and supernode amalgamation.  ... 
doi:10.1016/s0167-8191(02)00220-x fatcat:joqgju7aabasfpdcqh3qbvwraq
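
To make the "block dependency DAG" idea concrete: in a blocked right-looking Cholesky, three task kinds (diagonal factor, triangular solve, trailing update) depend on each other through the blocks they read and write. The sketch below enumerates those dependences for a dense grid of blocks; the task names follow common blocked-LAPACK convention, and the dict encoding is invented here, not taken from the paper:

```python
def block_cholesky_dag(K):
    """Return task -> set(prerequisite tasks) for blocked right-looking
    Cholesky on a K x K grid of blocks. Dense block pattern; a sparse
    matrix would simply omit tasks on structurally zero blocks."""
    deps = {}

    def dep(task, *parents):
        deps.setdefault(task, set()).update(parents)

    for k in range(K):
        # Factor diagonal block (k,k): waits on all updates targeting it.
        dep(("POTRF", k), *[("GEMM", k, k, s) for s in range(k)])
        for i in range(k + 1, K):
            # Solve block (i,k): needs the diagonal factor and all
            # earlier updates targeting block (i,k).
            dep(("TRSM", i, k), ("POTRF", k),
                *[("GEMM", i, k, s) for s in range(k)])
        for i in range(k + 1, K):
            for j in range(k + 1, i + 1):
                # Update block (i,j) with the outer product of column k
                # (the i == j case is a SYRK, folded in here for brevity).
                dep(("GEMM", i, j, k), ("TRSM", i, k), ("TRSM", j, k))
    return deps

dag = block_cholesky_dag(3)
print(dag[("POTRF", 2)])  # the two updates targeting block (2,2)
```

Any topological order of this DAG is a valid execution; the scheduling question the paper studies is which processor runs each task, and when.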

Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization

Heejo Lee, Jong Kim, Sung Je Hong, Sunggu Lee
2000 Proceedings of the 2000 ACM symposium on Applied computing - SAC '00  
In this paper, we represent tasks using a block dependency DAG that represents the execution behavior of block sparse Cholesky factorization in a distributed-memory system.  ...  Block-oriented sparse Cholesky factorization decomposes a sparse matrix into rectangular subblocks; each block can then be handled as a computational unit in order to increase data reuse in a hierarchical  ...  In addition, he provided us with his technical reports and the SPOOLES library, which is used for ordering and supernode amalgamation.  ... 
doi:10.1145/338407.338535 dblp:conf/sac/LeeKHL00 fatcat:nlkqqn64sng3tnhgmzgnve2rjy

Scalability of Sparse Direct Solvers [chapter]

Robert Schreiber
1993 IMA Volumes in Mathematics and its Applications  
In this paper we show that the column-oriented approach to sparse Cholesky for distributed-memory machines is not scalable.  ...  must grow as P^2 in order to maintain parallel efficiency bounded above zero.  ...  In this paper we investigate the scalability of these classes of methods for distributed sparse Cholesky factorization.  ... 
doi:10.1007/978-1-4613-8369-7_9 fatcat:cukmnvamejhj5a54l4mu2svuxq
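
For reference, the efficiency notion behind such scalability claims, stated as a standard definition rather than quoted from the paper (the subject of the P^2 growth condition is truncated in the excerpt above):

```latex
% Parallel efficiency with P processors:
%   T_1 = best sequential factorization time, T_P = parallel time.
\[
  E(P) \;=\; \frac{T_1}{P \, T_P}
\]
% A formulation is scalable if the problem size can grow with P so
% that E(P) remains bounded above zero; the snippet asserts that for
% column-oriented sparse Cholesky the required growth is as fast as P^2.
```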

Highly scalable parallel algorithms for sparse matrix factorization

A. Gupta, G. Karypis, V. Kumar
1997 IEEE Transactions on Parallel and Distributed Systems  
Through our analysis and experimental results, we demonstrate that our algorithm substantially improves the state of the art in parallel direct solution of sparse linear systems, both in terms of scalability  ...  Although in this paper we discuss Cholesky factorization of symmetric positive definite matrices, the algorithms can be adapted for solving sparse linear least squares problems and for Gaussian elimination  ...  We have developed highly scalable formulations of sparse Cholesky factorization that substantially improve the state of the art in parallel direct solution of sparse linear systems, both in terms of scalability  ... 
doi:10.1109/71.598277 fatcat:pwnnwungxbcavfi6imtrj7xv4q

An Efficient Block-Oriented Approach to Parallel Sparse Cholesky Factorization

Edward Rothberg, Anoop Gupta
1994 SIAM Journal on Scientific Computing  
This paper explores the use of a sub-block decomposition strategy for parallel sparse Cholesky factorization, in which the sparse matrix is decomposed into rectangular blocks.  ...  However, little progress has been made in producing a practical sub-block method.  ...  Acknowledgments We would like to thank Rob Schreiber and Sid Chatterjee for their discussions on block-oriented factorization. This research is supported under DARPA contract N00039-91-C-0138.  ... 
doi:10.1137/0915085 fatcat:47rvici4qrdbrkb3un5yomxqgm
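
A minimal sketch of the standard 2D block-cyclic owner map that sub-block decompositions typically start from; the paper's block decomposition and mapping heuristics are more elaborate, and the function below is invented for illustration:

```python
def block_cyclic_owner(i, j, pr, pc):
    """Owner of block (i, j) on a pr x pc process grid,
    using the standard 2D block-cyclic mapping."""
    return (i % pr) * pc + (j % pc)

# With a 2 x 2 process grid, a 4 x 4 block grid maps as:
for i in range(4):
    print([block_cyclic_owner(i, j, 2, 2) for j in range(4)])
# [0, 1, 0, 1]
# [2, 3, 2, 3]
# [0, 1, 0, 1]
# [2, 3, 2, 3]
```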

Scaling performance of interior-point method on large-scale chip multiprocessor system

Mikhail Smelyanskiy, Victor W Lee, Daehyun Kim, Anthony D Nguyen, Pradeep Dubey
2007 Proceedings of the 2007 ACM/IEEE conference on Supercomputing - SC '07  
While each of these kernels contains a large amount of parallelism, sparse irregular datasets seen in many optimization problems make parallelism difficult to exploit.  ...  IPM spends most of its computation time in a few sparse linear algebra kernels.  ...  Acknowledgments We would like to thank Sanjeev Kumar who provided the software and hardware implementations of task queue library in our simulator.  ... 
doi:10.1145/1362622.1362652 dblp:conf/sc/SmelyanskiyLKND07 fatcat:lscmlcqis5hpfknc3dgy3invv4

Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout [article]

Kyungjoo Kim, Sivasankaran Rajamanickam, George Stelle, H. Carter Edwards, Stephen L. Olivier
2016 arXiv pre-print
We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix.  ...  These tasks are interrelated through their data dependences in the factorization algorithm.  ...  As a result, our task-parallel Cholesky factorization has the same look-and-feel as the scalar Cholesky factorization, greatly improving programmability. We demonstrate this later in Section 3.3.  ... 
arXiv:1601.05871v1 fatcat:d4bhjutlfvgclllcv24hsmzn7a
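
A toy version of a 2D sparse partitioned-block layout, in the spirit of the abstract: cut the matrix into a grid of blocks and keep only the nonempty ones. The function and its dict-of-blocks representation are invented here and are not the paper's data structure:

```python
from scipy.sparse import csr_matrix, identity, random as sprandom

def partition_blocks(A, bs):
    """Cut a sparse matrix into a grid of bs x bs blocks, keeping only
    the nonempty ones, keyed by block coordinates."""
    A = csr_matrix(A)
    n = A.shape[0]
    blocks = {}
    for bi in range(0, n, bs):
        for bj in range(0, n, bs):
            blk = A[bi:bi + bs, bj:bj + bs]
            if blk.nnz:
                blocks[bi // bs, bj // bs] = blk
    return blocks

A = sprandom(16, 16, density=0.1, random_state=0)
A = A + A.T + 16 * identity(16)   # symmetric, with a full diagonal
blocks = partition_blocks(A, 4)
print(sorted(blocks))             # coordinates of the stored blocks
```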

Multi-pass Mapping Schemes for Parallel Sparse Matrix Computations [chapter]

Konrad Malkowski, Padma Raghavan
2005 Lecture Notes in Computer Science  
Consider the solution of a large sparse linear system Ax = b on multiprocessors. A parallel sparse matrix factorization is required in a direct solver.  ...  Alternatively, if Krylov subspace iterative methods are used, then incomplete forms of parallel sparse factorization are required for preconditioning.  ...  Empirical Results In this section, we empirically evaluate the quality of assignments for performing parallel sparse Cholesky factorization.  ... 
doi:10.1007/11428831_31 fatcat:26p5hsm6vfhrvco63ez5qpuvoy
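
The snippet contrasts the two solver routes; the sketch below runs both on one small SPD system. SciPy's ILU stands in for the incomplete Cholesky preconditioning the paper targets, and the test matrix is invented for the example:

```python
import numpy as np
from scipy.sparse import identity, random as sprandom
from scipy.sparse.linalg import spsolve, cg, spilu, LinearOperator

# Small SPD test system (a stand-in for the paper's problems).
n = 200
R = sprandom(n, n, density=0.02, random_state=1)
A = (R @ R.T + 10 * identity(n)).tocsc()
b = np.ones(n)

# Direct route: factorization-based sparse solve.
x_direct = spsolve(A, b)

# Iterative route: CG with an incomplete-factorization preconditioner.
ilu = spilu(A, drop_tol=1e-3)
M = LinearOperator((n, n), ilu.solve)
x_iter, info = cg(A, b, M=M)
assert info == 0 and np.allclose(x_direct, x_iter, atol=1e-4)
```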

A parallel formulation of interior point algorithms

George Karypis, Anshul Gupta, Vipin Kumar
1994 Supercomputing, Proceedings  
The performance of parallel Cholesky factorization is determined by (a) the communication overhead incurred by the algorithm, and (b) the load imbalance among the processors.  ...  In our parallel interior point algorithm, we use our recently developed parallel multifrontal algorithm, which has the smallest communication overhead among all parallel algorithms for Cholesky factorization  ...  Jonathan Eckstein for his guidance in interior point methods. We are also grateful to Dr.  ... 
doi:10.1145/602770.602808 fatcat:joxeq6lb7fh45jefcwyhdskrua

Run-time compilation for parallel sparse matrix computations

Cong Fu, Tao Yang
1996 Proceedings of the 10th international conference on Supercomputing - ICS '96  
We discuss a set of optimization strategies used in this system and demonstrate the application of this system in parallelizing sparse Cholesky and LU factorizations.  ...  However, it is still an open problem to efficiently parallelize sparse matrix factorization commonly used in iterative numerical problems.  ...  Figure 12: Distributions of the communication overheads for sparse Cholesky factorization on BCSSTK15. Figure 13: Distributions of the block sizes.  ... 
doi:10.1145/237578.237609 dblp:conf/ics/FuY96 fatcat:dhcqx63bavfoxdu7arfxjgdzkq

Scalable Sparse Matrix Techniques for Modeling Crack Growth [chapter]

P. Raghavan, M. A. James, J. C. Newman, B. R. Seshadri
2002 Lecture Notes in Computer Science  
Our initial results show that on average, our scheme speeds the Cholesky factor update step by a factor of 43.4. Our analytic results indicate that our method is both efficient and scalable.  ...  Recent research (see survey in [4]) has resulted in algorithms for a scalable, parallel implementation of sparse Cholesky factorization.  ...  The two schemes differ in the data movement and in the amount of temporary storage during factorization. Parallel sparse Cholesky with P processors utilizes both task and data parallelism.  ... 
doi:10.1007/3-540-48051-x_58 fatcat:gc2qg5qp6nfkpcc7olmgvemmg4

Solution of finite element problems using hybrid parallelization with MPI and OpenMP

José Miguel Vargas-Félix, Salvador Botello-Rionda
2012 Acta Universitaria  
This method allows parallelization. MPI (Message Passing Interface) is used to distribute the systems of equations, so that each one is solved on a computer of a cluster.  ...  Each system of equations is solved using a solver implemented with OpenMP as the local parallelization method.  ...  The Cholesky factorization could also be implemented in parallel.  ... 
doi:10.15174/au.2012.391 fatcat:k55hx5tcpvgdpd3sxj5qdma7hi
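
A minimal sketch of the decomposition pattern the abstract describes: one system of equations per MPI process, solved locally. mpi4py and SciPy stand in for the paper's MPI/OpenMP implementation, and the toy subdomain systems are invented:

```python
# Run with e.g.: mpiexec -n 4 python hybrid_sketch.py
import numpy as np
from mpi4py import MPI
from scipy.sparse import csr_matrix, identity
from scipy.sparse.linalg import spsolve

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # Hypothetical per-subdomain SPD systems (stand-ins for the FEM
    # subdomain systems produced by the domain decomposition).
    rng = np.random.default_rng(0)
    systems = []
    for _ in range(comm.Get_size()):
        M = rng.random((8, 8))
        A = csr_matrix(M @ M.T) + 8 * identity(8)
        systems.append((A, np.ones(8)))
else:
    systems = None

A_local, b_local = comm.scatter(systems, root=0)  # one system per rank
x_local = spsolve(A_local.tocsc(), b_local)       # local solve (the paper
                                                  # threads this with OpenMP)
solutions = comm.gather(x_local, root=0)
if rank == 0:
    print(len(solutions), "subdomain solutions gathered")
```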
Showing results 1 — 15 out of 1,968 results