Large-scale Simulations of 3D Groundwater Flow Using Parallel Geometric Multigrid Method
2013
Procedia Computer Science
In the present work, the effect of sparse matrix storage formats on the performance of parallel geometric multigrid solvers was evaluated, and a new data structure for the Ellpack-Itpack (ELL) format is ...
The proposed method is implemented for pGW3D-FVM, a parallel code for 3D groundwater flow simulations using the multigrid method, and the robustness and performance of the code were evaluated on up to 4,096 ...
Acknowledgements This work is supported by Core Research for Evolutional Science and Technology (CREST), the Japan Science and Technology Agency (JST), Japan. ...
doi:10.1016/j.procs.2013.05.293
fatcat:5lbwhvqvwfbuzbdc6xwmoj6ogy
Parallel spectroscopic imaging reconstruction with arbitrary trajectories using k-space sparse matrices
2009
Magnetic Resonance in Medicine
Alternative methods for reconstruction with undersampled Cartesian k-space data are the SMASH and GRAPPA algorithms that do the reconstruction in the k-space domain. ...
For undersampled k-space data on a Cartesian grid, the sensitivity encoding (SENSE) algorithm (5) can be applied to each spectral point in image domain after performing the Fourier transform to obtain ...
Grant sponsor: NIH; grant numbers: AG18942, CA 48269, and RR 09784. ...
doi:10.1002/mrm.21838
pmid:19165883
pmcid:PMC2754750
fatcat:ug6grszegjfslkac4ouidu6vvq
Heuristic Adaptability to Input Dynamics for SpMM on GPUs
[article]
2022
arXiv
pre-print
Orthogonal design principles for such a sparse problem should be extracted to form different algorithms, and further used for performance tuning. (2) Nontrivial implementations in the algorithm space. ...
Many previous studies exploit GPUs for SpMM acceleration because GPUs provide high bandwidth and parallelism. ...
Consider the first loop, 𝑀-Loop, of SpMM shown in Figure 2 (a). 𝑀-Loop iterates over the rows of the sparse matrix 𝐴. ...
arXiv:2202.08556v1
fatcat:eevj2z76kfexzheqmidsnigae4
Parallel implementation of recursive spectral bisection on the connection machine CM-5 system
[chapter]
1995
Parallel Computational Fluid Dynamics 1993
ACKNOWLEDGMENTS We would like to thank Arthur Raefsky (CENTRIC Engineering Systems), Horst Simon (NASA Ames), John Kennedy, Jacek Myczkowski and Richard Shapiro (Thinking Machines) for their helpful comments ...
Three dot-product operations for each partition; 2. One matrix-vector product of the form u = L v; and 3. ...
Elapsed times for different parts of the RSB algorithm for a partitioning into 256 subdomains on a 64-node CM-5 system. Table 2. M6 wing. Cost analysis for the computation of the Fiedler vector. ...
doi:10.1016/b978-044481999-4/50178-x
fatcat:u44r26vgnvh6himuoaffxozyai
Bpop: an efficient program for estimating base population allele frequencies in single and multiple group structured populations
2020
Agricultural and Food Science
Thus, the IOP and IM approaches can be recommended for large data sets because of their low memory use and computing time. ...
The required dense matrix products involving (A22 )-1v were implemented efficiently using sparse submatrices of A-1, where A and A22 are pedigree relationship matrices for all and genotyped animals, respectively ...
Acknowledgements The authors acknowledge the Irish Cattle Breeding Federation (ICBF) for providing the beef cattle data. ...
doi:10.23986/afsci.90955
fatcat:oxz64p6xirhqrcnrpqach6wngy
On the Fine-Grain Decomposition of Multicommodity Transportation Problems
1991
SIAM Journal on Optimization
Implementations on the Connection Machine CM-2 are discussed for both dense and sparse transportation problems. The dense implementation achieves a computing rate of 1.6-3 GFLOPS. ...
Hence, a fine-grain decomposition scheme is developed that is suitable for massively parallel computer architectures of the SIMD (i.e., single instruction stream, multiple data stream) class. ...
Acknowledgements: Professor Yair Censor deserves honorary co-authorship on this paper for numerous illuminating discussions on row-action algorithms and his constant encouragement. ...
doi:10.1137/0801038
fatcat:in4zj3rmj5ci5o3253olljf6ha
Quark propagator on the Connection Machine
1992
Parallel Computing
Schilling and N. Petkov, Quark propagator on the Connection Machine, Parallel Computing 18 (1992) 1291-1299. ...
We discuss and compare the structure, implementation and performance of two linear equation solvers, the Jacobi algorithm and the Conjugate Gradient algorithm, on the Connection Machine CM-2. ...
The 'Wilson fermion' matrix in Eq. 1 is then given ...
Fig. 2. (a) Time per iteration for Jacobi and Conjugate Gradient on the 8^4 lattice. ...
doi:10.1016/0167-8191(92)90120-v
fatcat:2zndav46vrb4vfwk4ycdsgetem
Parallel GPF solution: A GPU‐CPU‐based vectorization parallelization and sparse technique for NR implementation
2021
IET Renewable Power Generation
Considering the studies carried out in the authors' previous work of the application of GPU and sparse techniques on TS power flow (TS-PF), its applications on parallel implementations of corresponding ...
However, sequential GPF lacks timely solution performance required in several applications built on top of GPF, for example, contingency analysis. ...
and multi-type flexible power transmission devices, No. ...
doi:10.1049/rpg2.12316
fatcat:nntnlebhhffhzofjd5ydyko4fq
Model Order Reduction of Large-Scale Finite Element Systems in an MPI Parallelized Environment for Usage in Multibody Simulation
2016
Archive of Mechanical Engineering
Besides, an iterative solver is considered within the CMS-based method. ...
Due to the always increasing size of the non-reduced systems, the calculation of the projection matrix leads to a large demand of computational resources and cannot be done on usual serial computers with ...
Except for the CMS based method, only direct methods can currently be used in Morembs++ due to the properties of the coefficient matrix. ...
doi:10.1515/meceng-2016-0027
fatcat:voafy32znnbbjle4ivhrklscmq
3D Alternating Direction TV-Based Cone-Beam CT Reconstruction with Efficient GPU Implementation
2014
Computational and Mathematical Methods in Medicine
The iteration for this algorithm is simple but convergent. The simulation and real CT data reconstruction results indicate that the proposed algorithm is both fast and accurate. ...
The applied proximal technique avoids the horrible pseudoinverse computation of big matrix which makes the proposed algorithm applicable and efficient for CBCT imaging. ...
Introduction Recently, iterative image reconstruction (IIR) algorithms [1-6], especially compressive sensing (CS) [7-10] based ones, have been developed for X-ray computed ...
doi:10.1155/2014/982695
pmid:25045400
pmcid:PMC4089849
fatcat:q3aooknkszhnjkmicbjioqwada
Page 6251 of Mathematical Reviews Vol. , Issue 2003h
[page]
2003
Mathematical Reviews
“These algorithms are based on our earlier work on computing row and column counts for sparse Cholesky factorization, plus an efficient method to compute the column elimination tree of a sparse matrix ...
W. (1-ORNL-CM; Oak Ridge, TN
Computing row and column counts for sparse QR and LU factorization. (English summary)
BIT 41 (2001), no. 4, 693-710. ...
Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards
[article]
2015
arXiv
pre-print
We discuss an approach for solving sparse or dense banded linear systems A x = b on a Graphics Processing Unit (GPU) card. ...
sparse direct solvers: PARDISO, SuperLU, and MUMPS. ...
Solver comparisons raw data. For completeness, we provide here the raw comparison data for the tested solvers which was used in generating the figures and plots in the paper. ...
arXiv:1509.07919v1
fatcat:z6434bxvmrcq3emoxzxvm54gcu
Page 7619 of Mathematical Reviews Vol. , Issue 95m
[page]
1995
Mathematical Reviews
Our computational results on a CM-2 demonstrate the potential superiority of the partitioned inverse approach over the conventional substitution algorithm for highly parallel sparse triangular solu- ...
Summary: “The development of efficient, general-purpose software for the iterative solution of sparse linear systems on parallel MIMD computers depends on recent results from a wide variety of research ...
Noniterative MAP Reconstruction Using Sparse Matrix Representations
2009
IEEE Transactions on Image Processing
We present a method for noniterative maximum a posteriori (MAP) tomographic reconstruction which is based on the use of sparse matrix representations. ...
The second idea is a method for efficiently storing and computing the required orthonormal transformations, which we call a sparse-matrix transform (SMT). ...
Reconstructed images of (r) at z = 3 cm using the compressed H matrix based on KL transform (used both for data whitening and matrix decorrelation). ...
doi:10.1109/tip.2009.2023724
pmid:19556196
fatcat:oblllbiwebctdgty65pk3fp5yq
Accelerated Spatially Resolved Electrical Simulation of Photovoltaic Devices Using Photovoltaic-Oriented Nodal Analysis
2015
IEEE Transactions on Electron Devices
Sharpe from CREST, Loughborough University, for the fruitful discussions on sparse matrix techniques. ...
Iterative methods can have lower time complexity and thus higher efficiency than direct ones. Being row independent [28], they are suitable to combine sparse matrix techniques. ...
Parallel computing is an appropriate candidate for accelerating a program with intensive matrix computations. ...
doi:10.1109/ted.2015.2409058
fatcat:tfjxhgosnbgjbjj3o2xyd65idq