3D Localization for Light-Field Microscopy via Convolutional Sparse Coding on Epipolar Images
2020
IEEE Transactions on Computational Imaging
In this paper, we propose a new 3D localization approach to effectively detect 3D positions of neuronal cells from a single light-field image with high accuracy and outstanding robustness to light scattering ...
Extensive experiments demonstrate that our approach can reliably detect the 3D positions of granular targets with small Root Mean Square Error (RMSE), high robustness to optical aberration and light scattering ...
and thereby offering a path toward efficient 3D localization. ...
doi:10.1109/tci.2020.2997301
pmid:32851121
pmcid:PMC7442043
fatcat:jmptasrwb5a5ni2mssdln2x3o4
Communication-Avoiding and Memory-Constrained Sparse Matrix-Matrix Multiplication at Extreme Scale
2021
2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Sparse matrix-matrix multiplication (SpGEMM) is a widely used kernel in various graph, scientific computing and machine learning algorithms. ...
Distributed SpGEMM at this extreme scale faces two key challenges: (1) high communication cost and (2) inadequate memory to generate the output. ...
Parallel efficiency. We compute the parallel efficiency by using $\frac{P_1}{P_2} \cdot \frac{T(P_1)}{T(P_2)}$, where $T(P)$ denotes the runtime with $P$ processes, and $P_2 > P_1$. ...
doi:10.1109/ipdps49936.2021.00018
fatcat:hsqxxkxqdbakbcem3ssab7zm2u
Communication-Avoiding and Memory-Constrained Sparse Matrix-Matrix Multiplication at Extreme Scale
[article]
2020
arXiv
pre-print
Sparse matrix-matrix multiplication (SpGEMM) is a widely used kernel in various graph, scientific computing and machine learning algorithms. ...
Distributed SpGEMM at this extreme scale faces two key challenges: (1) high communication cost and (2) inadequate memory to generate the output. ...
Parallel efficiency. We compute the parallel efficiency by using $\frac{P_1}{P_2} \cdot \frac{T(P_1)}{T(P_2)}$, where $T(P)$ denotes the runtime with $P$ processes, and $P_2 > P_1$. ...
arXiv:2010.08526v1
fatcat:vf6csmuuerganoqulk23x56cnq
Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training
[article]
2021
arXiv
pre-print
parallelism, multiple tensor parallelism, and sequence parallelism. ...
The documentation can be found at https://www.colossalai.org and the source code can be found at https://github.com/hpcaitech/ColossalAI. ...
To address this problem, 2D, 2.5D and 3D tensor parallelism were proposed to fully eliminate memory redundancy. 2D Tensor Parallelism: This method (Xu et al. 2021) relies on the SUMMA matrix multiplication ...
arXiv:2110.14883v1
fatcat:vgoi2r4byjfpddnq7gvmur7gju
A massively parallel tensor contraction framework for coupled-cluster computations
2014
Journal of Parallel and Distributed Computing
Each contraction may be executed via matrix multiplication on a properly ordered and structured tensor. However, data transpositions are often needed to reorder the tensors for each contraction. ...
Our CCSD and CCSDT implementations achieve high parallel scalability on the BlueGene/Q and Cray XC30 supercomputer architectures showing that accurate electronic structure calculations can be effectively ...
This library employs similar matrix multiplication primitives (SUMMA and 3D algorithms) for distributed tensor contractions and mapping of data onto torus networks. ...
doi:10.1016/j.jpdc.2014.06.002
fatcat:76at7oi2vfhbxfc6tmbzoe2xyy
Locality-aware parallel block-sparse matrix-matrix multiplication using the Chunks and Tasks programming model
2016
Parallel Computing
We present a method for parallel block-sparse matrix-matrix multiplication on distributed memory clusters. ...
A distributed quadtree matrix representation is straightforward to implement due to our recent development of the Chunks and Tasks programming model [Parallel Comput. 40, 328 (2014)]. ...
Acknowledgements Support from the Göran Gustafsson foundation, the Swedish research council (grant no. 623-2009-803 and 621-2012-3861), the Lisa and Carl-Gustav Esseen foundation, and the Swedish national ...
doi:10.1016/j.parco.2016.06.005
fatcat:tleeeqbunjh43pyuzcqqufln24
Reducing Communication Costs for Sparse Matrix Multiplication within Algebraic Multigrid
2016
SIAM Journal on Scientific Computing
In particular, we show that the most commonly used parallel algorithm is often not the most communication-efficient one for all of the matrix-matrix multiplications involved. ...
In this paper, we show that the most commonly used parallel SpMM algorithm is often not the most communication-efficient one for all of the matrix multiplications involved. ...
[4] consider multiple algorithms, classifying them into 1D (described in Section 4), 2D (which include Sparse SUMMA and Sparse Cannon), and 3D varieties. ...
doi:10.1137/15m1028807
fatcat:idsejyuelnbvrjndf7lv3geeda
The parallelism motifs of genomic data analysis
2020
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
These applications differ from scientific simulations that dominate the workload on high-end parallel systems today and place different requirements on programming support, software libraries and parallel ...
Enormous community databases store and share these data with the research community, but some of these genomic data analysis problems require large-scale computational platforms to meet both the memory ...
[60], and GNNs are bottlenecked with large sparse matrix-dense matrix multiplications [61]. ...
doi:10.1098/rsta.2019.0394
pmid:31955674
fatcat:kzujmq5u2refvhoovtb2ap5vha
DASH: Distributed Data Structures and Parallel Algorithms in a Global Address Space
[chapter]
2020
Lecture Notes in Computational Science and Engineering
DASH is a new programming approach offering distributed data structures and parallel algorithms in the form of a C++ template library. ...
We also present a performance and productivity study where we compare DASH with a set of established parallel programming models. ...
We would also like to thank the German research foundation (DFG) for the funding received through the SPPEXA priority programme and initiators and managers of SPPEXA for their foresight and level-headed ...
doi:10.1007/978-3-030-47956-5_6
fatcat:44avzbgnkvh73iriqceboti4wu
Matrix-free construction of HSS representation using adaptive randomized sampling
[article]
2018
arXiv
pre-print
We discuss parallel implementation and computation and communication cost of both variants. ...
Parallel numerical results for a range of applications, including boundary element method matrices and quantum chemistry Toeplitz matrices, show the effectiveness, scalability and numerical robustness ...
We thank Daniel Haxton (LBNL) and Jeremiah Jones (Arizona State University) for providing us with the Quantum Chemistry test problem. ...
arXiv:1810.04125v2
fatcat:w2qklyn7rvb6fliishcmqzqb3i
Design and Implementation of the PULSAR Programming System for Large Scale Computing
2017
Supercomputing Frontiers and Innovations
The PULSAR programming model is quite simple, with point-to-point channels as the main communication abstraction. ...
The runtime implementation is very lightweight and fully distributed, and provides multithreading, message-passing and multi-GPU offload capabilities. ...
Acknowledgements This work has been supported by the National Science Foundation, under grant SHF-1117062, Parallel Unified Linear algebra with Systolic ARrays (PULSAR). ...
doi:10.14529/jsfi170101
fatcat:b6afot42rfakxicwf6rdz7sela
5G – Wireless Communications for 2020
2016
Journal of Communication and Information Systems
demanding and varied requirements that cannot be satisfied by current networks. ...
This is due not only to the growth in data traffic and in the number of connected terminals, but also because we are on the verge of a new era, where everyone and everything will be connected, with more ...
MIMO systems consist in the adoption of multiple antennas in both receiver and transmitter ends, aiming at improvements in spectral efficiency and robustness. ...
doi:10.14209/jcis.2016.14
fatcat:piyq3my3ejex7afk42qjwk6erm
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
2020
IEEE Access
This paper first introduces the key properties of two brain-inspired models, the Deep Neural Network (DNN) and the Spiking Neural Network (SNN), and then analyzes techniques to produce efficient and high-performance ...
prominence to the last two solutions since they offer greater design flexibility and bear the potential of high energy-efficiency, especially for the inference process. ...
Among the numerous subroutines implemented, the BLAS also include element-wise matrix multiplication, matrix-vector multiplication and matrix-matrix multiplication, also called General Matrix Multiplication ...
doi:10.1109/access.2020.3039858
fatcat:nticzqgrznftrcji4krhyjxudu
Analyzing trajectories on Grassmann manifold for early emotion detection from depth videos
2015
2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG)
That is, a sequence of 3D faces is first split into an indexed collection of short-term sub-sequences that are represented as matrices (subspaces), which define a special matrix manifold called the Grassmann manifold ...
They are respectively (1) a dictionary (of subspaces) representation associated to Dictionary Learning and Sparse Coding techniques and (2) a time-parameterized curve (trajectory) representation on the ...
The main steps of this pipeline are summa- is performed, the mean curvature is computed from each 3D frame (Fig. 4.2). ...
doi:10.1109/fg.2015.7163122
dblp:conf/fgr/AlashkarABD15
fatcat:53lkhk6apzaqndylmu5iphqupe
CALIBRATION OF A MULTI-CAMERA ROVER
2016
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Although photogrammetric specialists realize the benefits of such systems immediately, surveyors have difficulty finding efficient uses for them. ...
To approach these new measurement systems, the technique has to be understood and confidence in the accuracy has to grow. ...
This work has partly been funded by the EU and the Free State of Bavaria within the project "DiPhoBi4KMU". ...
doi:10.5194/isprs-archives-xli-b5-445-2016
fatcat:ufzzofmj7fdu7oky4byzfcmy6u