Parallel Overlapping Community Detection with SLPA

Konstantin Kuzmin, S. Yousaf Shah, Boleslaw K. Szymanski
2013 2013 International Conference on Social Computing  
We show that despite irregular data dependencies in the computation, parallel computing paradigms can significantly speed up the detection of overlapping communities of social networks which is computationally  ...  We show by experiments, how various parallel computing architectures can be utilized to analyze large social network data on both shared memory machines and distributed memory machines, such as IBM Blue  ...  The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory  ... 
doi:10.1109/socialcom.2013.37 dblp:conf/socialcom/KuzminSS13 fatcat:a33dufnjtra5rd2opa2to3wssy

Overlapping Communication with Computation Using OpenMP Tasks on the GTS Magnetic Fusion Code

Robert Preissl, Alice Koniges, Stephan Ethier, Weixing Wang, Nathan Wichmann
2010 Scientific Programming  
We study how to include new advanced hybrid models, which extend the applicability of OpenMP tasks and exploit multi-threaded MPI support to overlap communication and computation.  ...  We take an important magnetic fusion particle code that already includes several levels of parallelism including hybrid MPI combined with OpenMP.  ...  with John Shalf and Nicholas Wright and for the extended computer time as well as the valuable support  ... 
doi:10.1155/2010/951739 fatcat:cus7fhk7qfbcjhvv5t6uyiei3u

An Imbalanced Dataset and Class Overlapping Classification Model for Big Data

Mini Prince, P. M. Joe Prathap
2023 Computer systems science and engineering  
In this paper, we have proposed a parallel mode method using SMOTE and a MapReduce strategy, which distributes the operation of the algorithm among a group of computational nodes for addressing the aforementioned  ...  When big data is used in real-world applications, two data challenges, class overlap and class imbalance, arise.  ...  Extraction of information from certain massive data sources is a significant challenge for a majority of conventional machine learning techniques.  ... 
doi:10.32604/csse.2023.024277 fatcat:7ygqcewpezdklakx3g3kdn5dd4

mpi4jax: Zero-copy MPI communication of JAX arrays

Dion Häfner, Filippo Vicentini
2021 Journal of Open Source Software  
However, machine learning and (high-performance) scientific computing are often conducted on different hardware stacks: Machine learning is typically done on few highly parallel units (GPUs or TPUs) connected  ...  With a combination of NumPy (Harris et al., 2020) and mpi4py (Dalcín et al., 2005), Python users can already build massively parallel applications without delving into low-level programming languages,  ...  Acknowledgements: We thank all JAX developers, in particular Matthew Johnson and Peter Hawkins, for their outstanding support on the many issues we opened.  ... 
doi:10.21105/joss.03419 fatcat:puyovjiuzbgovpukwcyrl7zp7i

An online spike detection and spike classification algorithm capable of instantaneous resolution of overlapping spikes

Felix Franke, Michal Natora, Clemens Boucsein, Matthias H. J. Munk, Klaus Obermayer
2009 Journal of Computational Neuroscience  
In particular, it is desirable to have an algorithm that operates online, detects and classifies overlapping spikes in real time, and that adapts to non-stationary data.  ...  Acknowledgements This research was supported by the Federal Ministry of Education and Research (BMBF) with the grants 01GQ0743 and 01GQ0410. We thank Sven Dähne for technical support.  ...  The optimal linear filter was included into the evaluation to provide an upper bound on the performance one can achieve with our method.  ... 
doi:10.1007/s10827-009-0163-5 pmid:19499318 pmcid:PMC2950077 fatcat:qmohn2iahrg6ffm5sykefbe37q

pioman: A Pthread-Based Multithreaded Communication Engine

Alexandre Denis
2015 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing  
Moreover, the high number of cores brings new opportunities to parallelize communication libraries, so as to have proper background progression of communication and communication/computation overlap.  ...  This imposes new requirements on communication libraries, such as the need for MPI_THREAD_MULTIPLE level of multi-threading support.  ...  It is defined as overlap = T(computation + communication) / T(computation). Figure 8 details the overlap ratio with computation only on sender side (graph on left), and on receiver side (right).  ... 
doi:10.1109/pdp.2015.78 dblp:conf/pdp/Denis15 fatcat:7snp4rhezzc6fb4yz5rtsa47py

Multi-million particle molecular dynamics

D.C. Rapaport
1993 Computer Physics Communications  
This paper describes an implementation of a parallel molecular dynamics algorithm on the CM2 Connection Machine that is designed for large-scale simulations.  ...  All communication is between adjacent processing elements, eliminating the need for global communication. Performance measurements were made with systems containing over 10^6 particles.  ...  Acknowledgements: The author would like to thank the Supercomputer Computations Research Institute at Florida State University for its hospitality while this study was being carried out.  ... 
doi:10.1016/0010-4655(93)90058-k fatcat:26bw5zetovcshp3isehnfap5ci

Matlab and Parallel Computing

Magdalena Szymczyk, Piotr Szymczyk
2012 Image Processing & Communications  
Now MATLAB is enriched by the possibility of parallel computing with the Parallel Computing Toolbox™ and MATLAB Distributed Computing Server™.  ...  In this article we present some of the key features of MATLAB parallel applications focused on using GPU processors for image processing.  ...  CUDA implements parallel computing on the massive number of processors of a GPU for rather simple floating-point calculations with very fast communication between processors.  ... 
doi:10.2478/v10248-012-0048-5 fatcat:fax6h3gnn5ayzmixifkxnjrg5a

Advanced theory and practice for high performance computing and communications

Geoffrey Fox
2011 Concurrency and Computation  
The software models span HPF and data parallelism, to distributed information systems and object/dataflow parallelism on the Web.  ...  In Section 3, we show how the different problem categories or architectures are addressed by parallel software systems with different capabilities.  ...  This can be contrasted with Otto and Felten's MIMD computer chess algorithm, where the asynchronous evaluation of the pruned tree is "massively parallel" [Felten:88i].  ... 
doi:10.1002/cpe.1863 fatcat:dxzlqgakunhbpdkabvaktvwbay

Static Compilation Analysis for Host-Accelerator Communication Optimization [chapter]

Mehdi Amini, Fabien Coelho, François Irigoin, Ronan Keryell
2013 Lecture Notes in Computer Science  
We obtain an average speedup of 4 to 5 when compared to a naïve parallelization using a modern gpu with Par4All, hmpp, and pgi, and 3.5 when compared to an OpenMP version using a 12-core multiprocessor  ...  We present experimental results obtained with the Polybench 2.0, some Rodinia benchmarks, and with a real numerical simulation.  ...  Introduction Hybrid computers based on hardware accelerators are growing as a preferred method to improve performance of massively parallel software.  ... 
doi:10.1007/978-3-642-36036-7_16 fatcat:jpftk6kotjbgtnq2pux3ep7wly

A Static Approach for Compiling Communications in Parallel Scientific Programs

Damien Gautier De Lahaut, Cécile Germain
1995 Scientific Programming  
On most massively parallel architectures, the actual communication performance remains much less than the hardware capabilities.  ...  The performance of the model is evaluated, showing that performance is better in static cases and gracefully degrades with the growing complexity and dynamic aspect of the communication patterns.  ...  For instance, a blocked algorithm with block data distribution will provide few communications; if the communications are not overlapped with the computations, r-t will give the actual performance in  ... 
doi:10.1155/1995/397320 fatcat:akfqfkr3ivgxzfjpgd2kuyngou

A high-level characterisation and generalisation of communication-avoiding programming techniques [article]

Tobias Weinzierl
2019 arXiv   pre-print
To do so, it has to manage to move the compute data into the compute facilities on time.  ...  As communication and memory bandwidth cannot keep pace with the growth in compute capabilities and as latency increases (at least relative to what the hardware could do), communication-avoiding techniques  ...  Many computations, preferable of the same type, however tend to yield a high concurrency, while very few calculations, in the "worst case" with tight data dependencies, cannot make use of a parallel machine  ... 
arXiv:1909.10853v2 fatcat:72wuro6bhjhmfnilmvulwx4eqm

Parallelizing SLPA for Scalable Overlapping Community Detection

Konstantin Kuzmin, Mingming Chen, Boleslaw K. Szymanski
2015 Scientific Programming  
SLPA provides near linear time overlapping community detection and is well suited for parallelization.  ...  The algorithm was tested on four real-world datasets with up to 5.5 million nodes and 170 million edges.  ...  The performance is evaluated on two different threaded hardware architectures: a multiprocessor multicore Intel-based server and massively multithreaded Cray XMT2.  ... 
doi:10.1155/2015/461362 fatcat:qlqxzs46j5ecriseanfonpja7m

Improving communication scheduling for array redistribution

Minyi Guo, Yi Pan
2005 Journal of Parallel and Distributed Computing  
Many scientific applications require array redistribution when the programs run on distributed memory parallel computers.  ...  message lengths during one particular communication step.  ...  Our method can be extended to multi-dimensional arrays easily. We have implemented our algorithms on a massively parallel MIMD machine.  ... 
doi:10.1016/j.jpdc.2004.12.001 fatcat:csvulniqdbhqbej6n7tkzzxwoy

GPU-accelerated evolutionary design of the complete exchange communication on wormhole networks

Jiri Jaros, Radek Tyrala
2014 Proceedings of the 2014 conference on Genetic and evolutionary computation - GECCO '14  
The communication overhead is one of the main challenges in the exascale era, where millions of compute cores are expected to collaborate on solving complex jobs.  ...  Unfortunately, the execution time associated with the evolution process raises up to tens of hours, even when being run on a multi-core processor.  ...  GPU Architecture and Programming Graphics Processing Units (GPUs) are massively parallel accelerators primarily targeted on speeding up the computer graphics with millions of independent polygons and pixels  ... 
doi:10.1145/2576768.2598315 dblp:conf/gecco/JarosT14 fatcat:feoprmdpgndrzpoqgw7g6ujo5a