A vision for GPU-accelerated parallel computation on geo-spatial datasets

Sushil K. Prasad, Michael McDermott, Satish Puri, Dhara Shah, Danial Aghajarian, Shashi Shekhar, Xun Zhou
2015 SIGSPATIAL Special  
A GPU can yield one-to-two orders of magnitude speedups and will become increasingly more affordable and energy efficient due to mass marketing for gaming.  ...  We also survey the current landscape of representative geo-spatial problems and their parallel, GPU-based solutions.  ...  All-to-all Floyd-Warshall shortest path algorithm or Gaussian elimination for a system of linear equations, for example, manifest regular pattern of data access when parallelized [13].  ... 
doi:10.1145/2766196.2766200 fatcat:ayy3ozgvxvccxirmi3er65no54

A heterogeneous accelerator platform for multi-subject voxel-based brain network analysis

Yu Wang, Mo Xu, Ling Ren, Xiaorui Zhang, Di Wu, Yong He, Ningyi Xu, Huazhong Yang
2011 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)  
A promising method is to model the brain as a network based on modern imaging technologies and then to apply graph theory algorithms for analysis.  ...  In this work, we examine the computing bottleneck of this method, and propose a CPU-GPU heterogeneous platform to accelerate the process.  ...  All-Pair Shortest Paths APSP algorithms for graphs with diverse characteristics have been studied in depth.  ... 
doi:10.1109/iccad.2011.6105352 dblp:conf/iccad/WangXRZWHXY11 fatcat:n4t2e7r2bngadhrle7ica3e6ee


Jianlong Zhong, Bingsheng He
2014 SIGMOD record  
Medusa is a parallel graph processing system on graphics processors (GPUs).  ...  This simplifies the implementation of parallel graph processing on the GPU.  ...  The authors would like to thank anonymous reviewers for their valuable comments. This work is supported by a MoE AcRF Tier 2 grant (MOE2012-T2-2-067) in Singapore.  ... 
doi:10.1145/2694413.2694421 fatcat:xnflyqvcznesvi2hwbnax7ejui

Medusa: Simplified Graph Processing on GPUs

Jianlong Zhong, Bingsheng He
2014 IEEE Transactions on Parallel and Distributed Systems  
Recently, the graphics processing unit (GPU) has been adopted to accelerate various graph processing algorithms such as BFS and shortest paths.  ...  We develop a series of graph-centric optimizations based on the architecture features of GPUs for efficiency. Additionally, Medusa is extended to execute on multiple GPUs within a machine.  ...  ACKNOWLEDGEMENT The authors would like to thank the anonymous reviewers for their valuable comments, and Pawan Harish for providing the source code for CUDA-based BFS and shortest paths.  ... 
doi:10.1109/tpds.2013.111 fatcat:nxyliksc55ar3kn5sn73k22x5u

Benchmarking Graph Data Management and Processing Systems: A Survey [article]

Miyuru Dayarathna, Toyotaro Suzumura
2021 arXiv   pre-print
The development of scalable, representative, and widely adopted benchmarks for graph data systems has been a question for which answers have been sought for decades.  ...  We categorize the benchmarks into three areas focusing on benchmarks for graph processing systems, graph database benchmarks, and bigdata benchmarks with graph processing workloads.  ...  They have run PageRank, Betweenness-centrality (BC), and All Pairs Shortest Path (APSP) over a Google web graph as well as over a citations-Patents graph.  ... 
arXiv:2005.12873v4 fatcat:jh3367b4vjaqbgyvaccjnxqjfi

Extreme Big Data (EBD): Next Generation Big Data Infrastructure Technologies Towards Yottabyte/Year

2014 Supercomputing Frontiers and Innovations  
There are various predictions on "breaking down of silos" where organizations will open up their data for public consumption, either for free or for a fee, along with immense increase in varieties of data  ...  Although the project is still early in its lifetime, started in Oct. 2013, we have already achieved several notable results, including becoming world #1 on the Green Graph 500, a benchmark to measure the  ...  Historically a minimal or shortest-path deadlock-free routing has been used for interconnection networks.  ... 
doi:10.14529/jsfi140206 fatcat:fxzoyb3hgzaallrbrs4jkks2im

Briefing: High-performance computing for city-scale modelling and simulations

Kenichi Soga, Gerard Casey, Krishna Kumar, Bingyu Zhao
2017 Proceedings of the Institution of Civil Engineers - Smart Infrastructure and Construction  
becoming possible thanks to a surge of development in the high-performance computing (HPC) domain including advanced hardware, computational and algorithmic techniques such as domain decomposition across multi-GPUs  ...  Macro-scale events such as earthquakes influence the weights of the edges on a graph network (e.g. reduced road capacity/road closures update the weights of the edges/ removal of an edge), which in turn  ...  However, graph algorithms such as shortest path queries remain mathematically hard problems. GPU and algorithmic implementations show the most promise to solve these problems.  ... 
doi:10.1680/jsmic.17.00026 fatcat:iejlrxaprvelfkud5noakchnde

Speeding Up Network Layout and Centrality Measures for Social Computing Goals [chapter]

Puneet Sharma, Udayan Khurana, Ben Shneiderman, Max Scharrenbroich, John Locke
2011 Lecture Notes in Computer Science  
With the growth in adoption of SNA in different domains and increasing availability of huge networked datasets for analysis, social network analysts require faster tools that are also scalable.  ...  Our results, using NodeXL, show up to 802 times speedup for a Fruchterman-Rheingold graph layout and up to 17,972 times speedup for Eigenvector centrality metric calculations on a 240 core CUDA-capable  ...  Alan Sussman for their guidance through the course of this project. We are grateful to Microsoft External Research for funding the NodeXL project at University of Maryland.  ... 
doi:10.1007/978-3-642-19656-0_35 fatcat:7exuenfebrazdeud7tsd3p6wsu

An analysis of the graph processing landscape [article]

Miguel E. Coimbra, Alexandre P. Francisco, Luís Veiga
2021 arXiv   pre-print
and different definitions related to the potential for a graph to be updated.  ...  In the use-case of performing global computations over a graph, the graph is first ingested into a graph processing system from one of many digital representations.  ...  single-source shortest-paths.  ... 
arXiv:1911.11624v3 fatcat:t44dfa5cvfbk7exz4s2synm5z4

Implementation of MapReduce parallel computing framework based on multi-data fusion sensors and GPU cluster

Dajun Chang, Li Li, Ying Chang, Zhangquan Qiao
2021 EURASIP Journal on Advances in Signal Processing  
This experimental environment uses a Hadoop fully distributed cluster environment, and the entire programming of the single-source shortest path algorithm based on MapReduce is implemented in Java language  ...  This article mainly studies the MapReduce parallel computing framework based on multiple data fusion sensors and GPU clusters.  ...  All authors read and approved the final manuscript.  ... 
doi:10.1186/s13634-021-00787-7 fatcat:opqxux5c2nekrk2o7zb3mt44uq

An Overview of Efficient Interconnection Networks for Deep Neural Network Accelerators

Seyed Morteza Nabavinejad, Mohammad Baharloo, Kun-Chih Chen, Maurizio Palesi, Tim Kogel, Masoumeh Ebrahimi
2020 IEEE Journal on Emerging and Selected Topics in Circuits and Systems  
As a result, efficient interconnection and data movement mechanisms for future on-chip artificial intelligence (AI) accelerators are worthy of study.  ...  However, due to the massive parallel processing, the performance of the current large-scale artificial neural network is often limited by the huge communication overheads and storage requirements.  ...  To address this challenge, NVLink interconnect is introduced for multi-GPU clusters. • NVLink: NVLink is one of the well-known interconnect interfaces proposed for multi-GPU computing.  ... 
doi:10.1109/jetcas.2020.3022920 fatcat:idqitgwnrnegbd4dhrly3xsxbi

Efficient Radial Pattern Keyword Search on Knowledge Graphs in Parallel [article]

Yueji Yang, Anthony K. H. Tung
2020 arXiv   pre-print
Recently, keyword search on Knowledge Graphs (KGs) has become popular.  ...  The connection paths between keywords are selected in a way that leads to a result subgraph with a better semantic score.  ...  These works all try to find a tree-shaped answer that closely relates and covers all keywords. [7, 13, 19] provide disk solutions for huge graphs.  ... 
arXiv:2001.06770v1 fatcat:tyww7njvkngixptavayqbkd3ga

Literature Survey On Clustering Techniques

B.G.Obula Reddy
2012 IOSR Journal of Computer Engineering  
clustering methods by taking some example for each classification.  ...  Clustering is the assignment of data objects (records) into groups (called clusters) so that data objects from the same cluster are more similar to each other than objects from different clusters.  ...  Rupali Mankar and Ms. Amruta Faude who have also contributed a great deal for the initial completion of the work.  ... 
doi:10.9790/0661-0310112 fatcat:cflw4gkk6fgqxf6xoyztofhqfa

Better speedups using simpler parallel programming for graph connectivity and biconnectivity

James A. Edwards, Uzi Vishkin
2012 Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores - PMAM '12  
For graph connectivity, we demonstrate that XMT outperforms two recent NVIDIA GPUs of similar or greater silicon area.  ...  Speedups demonstrated for finding the biconnected components of a graph: 9x to 33x on the Explicit Multi-Threading (XMT) many-core computing platform relative to the best serial algorithm using a relatively  ...  The diameter of a connected graph is the length of the longest path in the set of all shortest paths between every pair of vertices in the graph.  ... 
doi:10.1145/2141702.2141714 dblp:conf/ppopp/EdwardsV12 fatcat:mkkj5d7id5etzibhxbxqd54fhu

High-throughput and scalable protein function identification with Hadoop and Map-only pattern of the MapReduce processing model

Dariusz Mrozek, Marek Suwała, Bożena Małysiak-Mrozek
2018 Knowledge and Information Systems  
a number of usage scenarios, including comparison of pairs of 3D protein structures during evaluation of predicted protein models, one-to-many comparisons while identifying possible functions of the given  ...  In this paper, we show how the protein function identification and finding structural homologs can be efficiently accelerated with the use of the MapReduce procedure executed on Hadoop cluster established  ...  The GPU-CASSERT outperformed all approaches with the average execution time 1.76183E−05 s per compared pair of protein chains.  ... 
doi:10.1007/s10115-018-1245-3 fatcat:tfqkm54aavgp7e3y22yvqtfgx4