Parallel Overlapping Community Detection with SLPA
2013
2013 International Conference on Social Computing
We show that, despite irregular data dependencies in the computation, parallel computing paradigms can significantly speed up the detection of overlapping communities in social networks, which is computationally ...
We show by experiments how various parallel computing architectures can be utilized to analyze large social network data on both shared memory machines and distributed memory machines, such as the IBM Blue ...
The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory ...
doi:10.1109/socialcom.2013.37
dblp:conf/socialcom/KuzminSS13
fatcat:a33dufnjtra5rd2opa2to3wssy
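The SLPA algorithm named in this entry propagates labels between neighbouring nodes and keeps a per-node label memory. A minimal single-threaded sketch, not the parallel implementation from the paper and with parameter choices that are mine, might look like:

```python
import random
from collections import Counter

def slpa(adj, iterations=20, threshold=0.1, seed=0):
    """Sketch of Speaker-Listener Label Propagation (SLPA).

    adj: dict mapping node -> list of neighbours.
    Returns dict mapping node -> set of community labels (overlapping).
    """
    rng = random.Random(seed)
    memory = {v: [v] for v in adj}   # each node starts with its own label
    for _ in range(iterations):
        for listener in adj:
            if not adj[listener]:
                continue
            # every neighbour "speaks" one label drawn from its memory
            spoken = [rng.choice(memory[s]) for s in adj[listener]]
            # the listener remembers the most popular label it heard
            memory[listener].append(Counter(spoken).most_common(1)[0][0])
    # post-processing: keep labels heard often enough -> overlapping communities
    communities = {}
    for v, labels in memory.items():
        counts = Counter(labels)
        total = sum(counts.values())
        communities[v] = {l for l, c in counts.items() if c / total >= threshold}
    return communities
```

Because a node may retain several frequent labels, community memberships can overlap, which is the property the paper parallelizes at scale.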
Overlapping Communication with Computation Using OpenMP Tasks on the GTS Magnetic Fusion Code
2010
Scientific Programming
We study how to include new advanced hybrid models, which extend the applicability of OpenMP tasks and exploit multi-threaded MPI support to overlap communication and computation. ...
We take an important magnetic fusion particle code that already includes several levels of parallelism including hybrid MPI combined with OpenMP. ...
... with John Shalf and Nicholas Wright and for the extended computer time as well as the valuable support ...
doi:10.1155/2010/951739
fatcat:cus7fhk7qfbcjhvv5t6uyiei3u
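The overlap idea in this entry, letting communication progress in the background while useful computation continues, can be illustrated generically. A hedged Python sketch, with a thread standing in for an OpenMP task and a sleep standing in for a real MPI exchange (neither is the paper's GTS code):

```python
import threading
import time

def overlap_demo():
    """Overlap a (simulated) communication with independent computation.

    The thread plays the role of an OpenMP task driving an MPI exchange;
    the sleep plays the role of network latency.
    """
    result = {}

    def communicate():
        time.sleep(0.05)                 # stand-in for a blocking halo exchange
        result["halo"] = [0.0, 0.0, 0.0, 0.0]

    t = threading.Thread(target=communicate)
    t.start()                            # exchange progresses in the background
    interior = sum(i * i for i in range(100_000))  # independent interior work
    t.join()                             # synchronise before using the halo data
    return interior, result["halo"]
```

The essential design point is the same as in the hybrid MPI/OpenMP case: the computation between `start()` and `join()` must not depend on the data being communicated.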
An Imbalanced Dataset and Class Overlapping Classification Model for Big Data
2023
Computer systems science and engineering
In this paper, we propose a parallel method using SMOTE and a MapReduce strategy, which distributes the operation of the algorithm among a group of computational nodes to address the aforementioned ...
When big data is used in real-world applications, two data challenges arise: class overlap and class imbalance. ...
Extraction of information from certain massive data sources is a significant challenge for a majority of conventional machine learning techniques. ...
doi:10.32604/csse.2023.024277
fatcat:7ygqcewpezdklakx3g3kdn5dd4
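SMOTE, named in this entry, oversamples the minority class by interpolating between a minority point and one of its nearest neighbours; the paper distributes this work across nodes with MapReduce, but the core interpolation step alone can be sketched as follows (plain Python, names and parameters mine):

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Sketch of SMOTE: synthesise minority samples by interpolating
    between a minority point and one of its k nearest neighbours."""
    rng = random.Random(seed)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()   # point lies on the segment between x and nb
        synthetic.append(tuple(xi + gap * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic
```

In the MapReduce setting described by the abstract, each node would presumably run this step on its own data partition before the results are combined.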
mpi4jax: Zero-copy MPI communication of JAX arrays
2021
Journal of Open Source Software
However, machine learning and (high-performance) scientific computing are often conducted on different hardware stacks: Machine learning is typically done on few highly parallel units (GPUs or TPUs) connected ...
With a combination of NumPy (Harris et al., 2020) and mpi4py (Dalcín et al., 2005) , Python users can already build massively parallel applications without delving into low-level programming languages, ...
Acknowledgements We thank all JAX developers, in particular Matthew Johnson and Peter Hawkins, for their outstanding support on the many issues we opened. ...
doi:10.21105/joss.03419
fatcat:puyovjiuzbgovpukwcyrl7zp7i
An online spike detection and spike classification algorithm capable of instantaneous resolution of overlapping spikes
2009
Journal of Computational Neuroscience
In particular, it is desirable to have an algorithm that operates online, detects and classifies overlapping spikes in real time, and that adapts to non-stationary data. ...
Acknowledgements This research was supported by the Federal Ministry of Education and Research (BMBF) with the grants 01GQ0743 and 01GQ0410. We thank Sven Dähne for technical support. ...
The optimal linear filter was included into the evaluation to provide an upper bound on the performance one can achieve with our method. ...
doi:10.1007/s10827-009-0163-5
pmid:19499318
pmcid:PMC2950077
fatcat:qmohn2iahrg6ffm5sykefbe37q
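The detection step this entry mentions can be reduced, in its simplest form, to finding upward threshold crossings in the filtered signal. A toy sketch, with no overlap resolution and no adaptation, unlike the paper's actual method:

```python
def detect_spikes(signal, threshold):
    """Report indices where the signal crosses the threshold upward,
    the simplest possible online detection step."""
    spikes = []
    prev = 0.0
    for i, v in enumerate(signal):
        if prev < threshold <= v:   # upward crossing at sample i
            spikes.append(i)
        prev = v
    return spikes
```

Resolving overlapping spikes, the paper's contribution, requires going beyond this: classifying each detected event against waveform templates and subtracting matched components.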
pioman: A Pthread-Based Multithreaded Communication Engine
2015
2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing
Moreover, the high number of cores brings new opportunities to parallelize communication libraries, so as to have proper background progression of communication and communication/computation overlap. ...
This imposes new requirements on communication libraries, such as the need for MPI_THREAD_MULTIPLE level of multi-threading support. ...
It is defined as overlap = T(computation + communication) / T(computation). Figure 8 details the overlap ratio with computation only on the sender side (graph on left), and on the receiver side (right). ...
doi:10.1109/pdp.2015.78
dblp:conf/pdp/Denis15
fatcat:7snp4rhezzc6fb4yz5rtsa47py
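The overlap ratio defined in this entry is a plain quotient of two timings; as a worked example (the function name is mine, not pioman's):

```python
def overlap_ratio(t_computation, t_computation_plus_communication):
    """overlap = T(computation + communication) / T(computation).

    A ratio of 1.0 means the communication was entirely hidden behind
    the computation; larger values mean imperfect overlap.
    """
    return t_computation_plus_communication / t_computation
```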
Multi-million particle molecular dynamics
1993
Computer Physics Communications
This paper describes an implementation of a parallel molecular dynamics algorithm on the CM2 Connection Machine that is designed for large-scale simulations. ...
All communication is between adjacent processing elements, eliminating the need for global communication. Performance measurements were made with systems containing over 10^6 particles. ...
Acknowledgements The author would like to thank the Supercomputer Computations Research Institute at Florida State University for its hospitality while this study was being carried out. ...
doi:10.1016/0010-4655(93)90058-k
fatcat:26bw5zetovcshp3isehnfap5ci
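The adjacent-only communication pattern this entry describes is the classic nearest-neighbour halo exchange of domain decomposition. A 1-D pure-Python caricature of it (the paper's CM2 implementation is of course far more involved):

```python
def exchange_ghosts(domains):
    """Each subdomain receives ghost cells only from its two adjacent
    subdomains on a periodic ring; no global communication is needed.

    domains: list of lists of cells, one list per processing element.
    Returns, per subdomain, the (left_ghost, right_ghost) pair.
    """
    n = len(domains)
    ghosts = []
    for i, cells in enumerate(domains):
        left = domains[(i - 1) % n][-1]   # last cell of the left neighbour
        right = domains[(i + 1) % n][0]   # first cell of the right neighbour
        ghosts.append((left, right))
    return ghosts
```

Because every transfer involves only adjacent ranks, the communication cost per step stays constant as the number of subdomains grows, which is what makes multi-million-particle runs feasible.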
Matlab and Parallel Computing
2012
Image Processing & Communications
MATLAB is now enriched with parallel computing capabilities through the Parallel Computing Toolbox™ and MATLAB Distributed Computing Server™. ...
In this article we present some of the key features of MATLAB parallel applications focused on using GPU processors for image processing. ...
CUDA implements parallel computing on the massive number of processors of a GPU for rather simple floating-point calculations with very fast communication between processors. ...
doi:10.2478/v10248-012-0048-5
fatcat:fax6h3gnn5ayzmixifkxnjrg5a
Advanced theory and practice for high performance computing and communications
2011
Concurrency and Computation
The software models span HPF and data parallelism, to distributed information systems and object/dataflow parallelism on the Web. ...
In Section 3, we show how the different problem categories or architectures are addressed by parallel software systems with different capabilities. ...
This can be contrasted with Otto and Felten's MIMD computer chess algorithm, where the asynchronous evaluation of the pruned tree is "massively parallel" [Felten:88i]. ...
doi:10.1002/cpe.1863
fatcat:dxzlqgakunhbpdkabvaktvwbay
Static Compilation Analysis for Host-Accelerator Communication Optimization
[chapter]
2013
Lecture Notes in Computer Science
We obtain an average speedup of 4 to 5 when compared to a naïve parallelization using a modern GPU with Par4All, HMPP, and PGI, and 3.5 when compared to an OpenMP version using a 12-core multiprocessor ...
We present experimental results obtained with the Polybench 2.0, some Rodinia benchmarks, and with a real numerical simulation. ...
Introduction Hybrid computers based on hardware accelerators are growing as a preferred method to improve performance of massively parallel software. ...
doi:10.1007/978-3-642-36036-7_16
fatcat:jpftk6kotjbgtnq2pux3ep7wly
A Static Approach for Compiling Communications in Parallel Scientific Programs
1995
Scientific Programming
On most massively parallel architectures, the actual communication performance remains much less than the hardware capabilities. ...
The performance of the model is evaluated, showing that performance is better in static cases and gracefully degrades with the growing complexity and dynamic aspect of the communication patterns. ...
For instance, a blocked algorithm with block data distribution will produce few communications; if the communications are not overlapped with the computations, it will give the actual performance in ...
doi:10.1155/1995/397320
fatcat:akfqfkr3ivgxzfjpgd2kuyngou
A high-level characterisation and generalisation of communication-avoiding programming techniques
[article]
2019
arXiv
pre-print
To do so, it has to move the compute data into the compute facilities on time. ...
As communication and memory bandwidth cannot keep pace with the growth in compute capabilities and as latency increases---at least relative to what the hardware could do---communication-avoiding techniques ...
Many computations, preferably of the same type, tend to yield a high concurrency, while very few calculations, in the "worst case" with tight data dependencies, cannot make use of a parallel machine ...
arXiv:1909.10853v2
fatcat:72wuro6bhjhmfnilmvulwx4eqm
Parallelizing SLPA for Scalable Overlapping Community Detection
2015
Scientific Programming
SLPA provides near linear time overlapping community detection and is well suited for parallelization. ...
The algorithm was tested on four real-world datasets with up to 5.5 million nodes and 170 million edges. ...
The performance is evaluated on two different threaded hardware architectures: a multiprocessor multicore Intel-based server and massively multithreaded Cray XMT2. ...
doi:10.1155/2015/461362
fatcat:qlqxzs46j5ecriseanfonpja7m
Improving communication scheduling for array redistribution
2005
Journal of Parallel and Distributed Computing
Many scientific applications require array redistribution when the programs run on distributed memory parallel computers. ...
... message lengths during one particular communication step. ...
Our method can be extended to multi-dimensional arrays easily. We have implemented our algorithms on a massively parallel MIMD machine. ...
doi:10.1016/j.jpdc.2004.12.001
fatcat:csvulniqdbhqbej6n7tkzzxwoy
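Array redistribution, as in this entry, requires each source processor to send index sets to several destinations. A sketch that enumerates the messages for a BLOCK-to-CYCLIC redistribution (scheduling those messages to avoid contention, the paper's actual topic, is not shown):

```python
from collections import defaultdict

def redistribution_messages(n, p, q):
    """Messages needed to redistribute an array of length n from
    BLOCK over p processors to CYCLIC over q processors.

    Returns {(src, dst): [global indices]} for every non-empty message.
    """
    msgs = defaultdict(list)
    block = (n + p - 1) // p          # block size under BLOCK(p)
    for idx in range(n):
        src = idx // block            # owner under the BLOCK distribution
        dst = idx % q                 # owner under the CYCLIC distribution
        msgs[(src, dst)].append(idx)
    return dict(msgs)
```

A communication schedule would then order these (src, dst) pairs into steps so that no processor sends or receives twice in the same step.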
GPU-accelerated evolutionary design of the complete exchange communication on wormhole networks
2014
Proceedings of the 2014 conference on Genetic and evolutionary computation - GECCO '14
The communication overhead is one of the main challenges in the exascale era, where millions of compute cores are expected to collaborate on solving complex jobs. ...
Unfortunately, the execution time associated with the evolution process rises to tens of hours, even when run on a multi-core processor. ...
GPU Architecture and Programming: Graphics Processing Units (GPUs) are massively parallel accelerators primarily targeted at speeding up computer graphics with millions of independent polygons and pixels ...
doi:10.1145/2576768.2598315
dblp:conf/gecco/JarosT14
fatcat:feoprmdpgndrzpoqgw7g6ujo5a