The roles of sparse direct methods in large-scale simulations

X S Li, W Gao, P J R Husbands, C Yang, E G Ng
2005 Journal of Physics, Conference Series  
Overview of SuperLU: SuperLU is a leading scalable solver for sparse linear systems using direct methods, which is partly funded through the TOPS SciDAC project (led by David Keyes) [11].  ...  Sparse systems of linear equations and eigen-equations arise at the heart of many large-scale, vital simulations in DOE.  ...  The solver can easily work with any other reordering algorithms not implemented in SuperLU.  ... 
doi:10.1088/1742-6596/16/1/065 fatcat:ujeo2vnbdfgbjhzzrmxhyvo3te

Efficient Implementation of Gaussian Belief Propagation Solver for Large Sparse Diagonally Dominant Linear Systems

Yousef El-Kurdi, Warren J. Gross, Dennis Giannacopoulos
2012 IEEE transactions on magnetics  
Also we present a new flexible scheduling scheme of the algorithm that is aimed for implementation on parallel architectures by reducing the iteration count of parallel GaBP and achieving better hardware  ...  Compared to the diagonally-preconditioned conjugate gradient method, our algorithm demonstrates empirical improvements up to 6× in iteration count and speedups up to 1.8× in execution time.  ...  A newly introduced iterative method, shown in [2], uses BP over Gaussian graphical models (GaBP) as a solver for linear system of equations.  ... 
doi:10.1109/tmag.2011.2176318 fatcat:4czltpkvj5fazguomgsdny6qdm

Sparsifying Synchronization for High-Performance Shared-Memory Sparse Triangular Solver [chapter]

Jongsoo Park, Mikhail Smelyanskiy, Narayanan Sundaram, Pradeep Dubey
2014 Lecture Notes in Computer Science  
It is widely used in several types of sparse linear solvers, and it is commonly considered challenging to parallelize and scale even on a moderate number of cores.  ...  This challenge is due to the fact that triangular solver typically has limited task-level parallelism and relies on fine-grain synchronization to exploit this parallelism, compared to data-parallel operations  ...  Scott, Nadathur Satish, and Alexander A. Kalinkin for discussion during the initial stage of our project. We also thank Michael A.  ... 
doi:10.1007/978-3-319-07518-1_8 fatcat:z3lzntgb6bh27i3lo2lzmojox4

Algorithmic optimizations of a conjugate gradient solver on shared memory architectures

Henrik Löf, Jarmo Rantakokko
2006 International Journal of Parallel, Emergent and Distributed Systems  
Here proper data distribution and algorithmic optimizations play a vital role for performance.  ...  On a NUMA system the performance is significantly improved with the algorithmic optimizations leaving the system dependent global reduction operations as a bottleneck.  ...  An example of an important scientific application exhibiting an unstructured pattern of communication is an iterative solver for large sparse systems of equations.  ... 
doi:10.1080/17445760600568139 fatcat:c2ts7yudq5etdbczk2k6fyjgf4

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices

Jongsoo Park, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Alexander Heinecke, Dhiraj D. Kalamkar, Xing Liu, Md. Mostofa Ali Patwary, Yutong Lu, Pradeep Dubey
2014 SC14: International Conference for High Performance Computing, Networking, Storage and Analysis  
While it is a well-known challenge to efficiently parallelize Gauss-Seidel smoother, the most time-consuming kernel in HPCG, our algorithmic and architecture-aware optimizations deliver 95% and 68% of the  ...  A new sparse high performance conjugate gradient benchmark (HPCG) has been recently released to address challenges in the design of sparse linear solvers for the next generation extreme-scale computing  ...  Although its results are not presented, Stampede was used as a test bed, and we thank Carlos Rosales-Fernandez and the other staff at TACC, Khaled Hamidouche and the other MVAPICH team members at OSU,  ... 
doi:10.1109/sc.2014.82 dblp:conf/sc/ParkSVHKLPLD14 fatcat:ktiisywie5hhznon5qc2tuydoa

Page 6173 of Mathematical Reviews Vol. , Issue 95j [page]

1995 Mathematical Reviews  
A locally optimized reordering algorithm and its application to a parallel sparse linear system solver. (English and German summaries) Computing 54 (1995), no. 1, 39-67.  ...  Summary: “A coarse-grain parallel solver for systems of linear algebraic equations with general sparse matrices by Gaussian elimination is discussed.  ... 

Structure-adaptive parallel solution of sparse triangular linear systems

Ehsan Totoni, Michael T. Heath, Laxmikant V. Kale
2014 Parallel Computing  
Solution of sparse triangular systems of linear equations is a performance bottleneck in many methods for solving more general sparse systems.  ...  We describe the implementation of our algorithm in Charm++ and MPI and present promising results on up to 512 cores of BlueGene/P, using numerous sparse matrices from real applications.  ...  Conclusions and Future Work: Parallel solution of sparse triangular linear systems is an important kernel for many numerical methods used in applications.  ... 
doi:10.1016/j.parco.2014.06.006 fatcat:emxx7wcn4vceflpomruuw7mw4m

Reordering Strategy for Blocking Optimization in Sparse Linear Solvers

Gregoire Pichon, Mathieu Faverge, Pierre Ramet, Jean Roman
2017 SIAM Journal on Matrix Analysis and Applications  
Solving sparse linear systems is a problem that arises in many scientific applications, and sparse direct solvers are a time-consuming and key kernel for those applications and for more advanced solvers  ...  The preprocessing steps of sparse direct solvers (ordering and block-symbolic factorization) are two major steps that lead to a reduced amount of computation and memory and to a better task granularity to  ...  Many scientific applications, such as electromagnetism, astrophysics, and computational fluid dynamics, use numerical models that require solving linear systems of the form Ax = b.  ... 
doi:10.1137/16m1062454 fatcat:nqepemeio5f3dd3oqc7l6pmtrm

Increasing the Locality of Iterative Methods and Its Application to the Simulation of Semiconductor Devices

J.C. Pichel, D.B. Heras, J.C. Cabaleiro, A.J. García-Loureiro, F.F. Rivera
2009 The international journal of high performance computing applications  
In these simulations the solution of large sparse linear equation systems is required, which are often solved using iterative methods.  ...  In this paper a technique for improving the locality of sparse matrix codes is presented.  ...  Acknowledgments: This work has been partially funded by project TIN2007-67537-C03-01 of Spanish Ministry of Education and Science (MEC).  ... 
doi:10.1177/1094342009106416 fatcat:ilvouiq345by3ioo4245ircqy4

Implementing the conjugate gradient algorithm on multi-core systems

W.A. Wiggers, V. Bakker, A.B.J. Kokkeler, G.J.M. Smit
2007 2007 International Symposium on System-on-Chip  
In linear solvers, like the conjugate gradient algorithm, sparse-matrix vector multiplication is an important kernel. Due to the sparseness of the matrices, the solver runs relatively slow.  ...  For digital optical tomography (DOT), a large set of linear equations have to be solved which currently takes in the order of hours on desktop computers.  ...  Our DOT algorithm requires multiple linear equations to be solved. Multiple threads are used to parallelize the execution of multiple solvers over the two processors.  ... 
doi:10.1109/issoc.2007.4427436 dblp:conf/issoc/WiggersBKS07 fatcat:y7sfefqmc5dnjkbgqtp6q2a7ji

Using PETSc to develop scalable applications for next-generation power grid

Shrirang Abhyankar, Barry Smith, Hong Zhang, Alexander Flueck
2011 Proceedings of the first international workshop on High performance computing, networking and analytics for the power grid - HiPCNA-PG '11  
have highly optimized implementations, and a wide array of tested numerical solvers.  ...  Developing scalable software for existing and emerging power system problems is a challenging task and requires a great deal of concentrated time and effort.  ...  As applications for the next generation power grid are developed there needs to be a benchmarking of various parallel algorithms on different system topologies to select the optimal or a set of optimal  ... 
doi:10.1145/2096123.2096138 fatcat:vvvhz62xovgyndgsswwixkq5s4

A parallel block frontal solver for large scale process simulation: Reordering effects

J.U. Mallya, S.E. Zitney, S. Choudhary, M.A. Stadtherr
1997 Computers and Chemical Engineering  
For the simulation and optimization of large-scale chemical processes, the overall computing time is often dominated by the time needed to solve a large sparse system of linear equations.  ...  We describe here a parallel frontal solver which can significantly reduce the wallclock time required to solve these linear equation systems using parallel/vector supercomputers.  ...  Kirk Abbott for providing the ASCEND matrices and the tear drop reorderings.  ... 
doi:10.1016/s0098-1354(97)87541-2 fatcat:li5rmzajtzbjfpcwlavayovrau

A Parallel Block Frontal Solver For Large Scale Process Simulation: Reordering Effects

J Mallya
1997 Computers and Chemical Engineering  
For the simulation and optimization of large-scale chemical processes, the overall computing time is often dominated by the time needed to solve a large sparse system of linear equations.  ...  We describe here a parallel frontal solver which can significantly reduce the wallclock time required to solve these linear equation systems using parallel/vector supercomputers.  ...  Kirk Abbott for providing the ASCEND matrices and the tear drop reorderings.  ... 
doi:10.1016/s0098-1354(97)00088-4 fatcat:g6af75duf5a4vomgd33tep2kke

7. Combinatorial Parallel and Scientific Computing [chapter]

Ali Pınar, Bruce Hendrickson
2006 Parallel Processing for Scientific Computing  
Combinatorial algorithms have long played a pivotal enabling role in many applications of parallel computing.  ...  Graph algorithms in particular arise in load balancing, scheduling, mapping and many other aspects of the parallelization of irregular applications.  ...  Acknowledgements We are grateful to Srinivas Aluru, David  ... 
doi:10.1137/1.9780898718133.ch7 fatcat:szr5t7calne2fbhkbkvwccrj6y

Exploiting locality for irregular scientific codes

Hwansoo Han, Chau-Wen Tseng
2006 IEEE Transactions on Parallel and Distributed Systems  
Experiments on irregular scientific codes for a variety of meshes show locality optimization techniques are effective for both sequential and parallelized codes, improving performance by 60-87 percent.  ...  Applied to dense inputs, Z-SORT achieves performance close to data reordering combined with other computation reordering but without the overhead involved in data reordering.  ...  Locality optimizations have also been developed in the context of sparse linear algebra.  ... 
doi:10.1109/tpds.2006.88 fatcat:5tijebdzgrefzdcj3jghapid6u