Performance analysis issues for parallel implementations of propagation algorithm

L. Brenner, L.G. Fernandes, P. Fernandes, A. Sales
Proceedings. 15th Symposium on Computer Architecture and High Performance Computing  
This paper presents a theoretical study to evaluate the performance of a family of parallel implementations of the propagation algorithm.  ...  The theoretical performance analysis is based on the construction of generic models using Stochastic Automata Networks (SAN) formalism to describe each implementation scheme.  ...  Parallel Propagation Implementation The parallel implementation for the propagation algorithm discussed in this section was developed in order to allow the use of this new algorithm in real situations.  ... 
doi:10.1109/cahpc.2003.1250337 dblp:conf/sbac-pad/BrennerFFS03 fatcat:sln6kh62offxjbsuu2gazr2ehq

BioGraphE: high-performance bionetwork analysis using the Biological Graph Environment

George Chin, Daniel G Chavarria, Grant C Nakamura, Heidi J Sofia
2008 BMC Bioinformatics  
Conclusion: In our application of BioGraphE to conduct bionetwork analysis of homology networks, we found that BioGraphE and a custom, parallel implementation of the Survey Propagation SAT solver were  ...  Many traditional graph algorithms such as k-clique, k-coloring, and subgraph matching have great potential as analysis techniques for newly available data in biology.  ...  Dept. of Energy Office of Advanced Scientific Computing Research. PNNL is operated by Battelle for the U. S. Dept. of Energy.  ... 
doi:10.1186/1471-2105-9-s6-s6 pmid:18541059 pmcid:PMC2423447 fatcat:if26b5acl5cl3igc7nq5cxbhti

Software Design Challenges in Time Series Prediction Systems Using Parallel Implementation of Artificial Neural Networks

Narayanan Manikandan, Srinivasan Subha
2016 The Scientific World Journal  
This framework is tested for finding the accuracy and performance of parallel algorithms used.  ...  This paper addressed some architectural design related issues for performance improvement through vectorising the strengths of multivariate econometric time series models and Artificial Neural Networks  ...  Conflict of Interests The authors declare that there is no conflict of interests regarding the publication of this paper.  ... 
doi:10.1155/2016/6709352 pmid:26881271 pmcid:PMC4735934 fatcat:rckeurpilvh2znyz7olr4tuile

Scalable HMM based inference engine in large vocabulary continuous speech recognition

Jike Chong, Kisun You, Youngmin Yi, Ekaterina Gonina, Christopher Hughes, Wonyong Sung, Kurt Keutzer
2009 2009 IEEE International Conference on Multimedia and Expo  
The highest performing algorithm style varies with the implementation platform.  ...  We propose four application-level implementation alternatives we call "algorithm styles", and construct highly optimized implementations on two parallel platforms: an Intel Core i7 multicore processor  ...  Fig. 1. Architecture of large vocabulary continuous speech recognition. Fig. 2. The algorithmic level design space for graph traversal scalability analysis for the inference engine. Fig. 3. Ratio  ... 
doi:10.1109/icme.2009.5202871 dblp:conf/icmcs/ChongYYGHSK09 fatcat:f7xpdimcwbam3nl2atjdfktbd4

GPU-Accelerated Foreground Segmentation and Labeling for Real-Time Video Surveillance

Wei Song, Yifei Tian, Simon Fong, Kyungeun Cho, Wei Wang, Weiqiang Zhang
2016 Sustainability  
The proposed GPU-based image processing algorithms are implemented using the compute unified device architecture (CUDA) toolkit.  ...  For real-time moving object detection in video, this paper applies a parallel computing technology to develop a feedback foreground-background segmentation method and a parallel connected component labeling  ...  Figure 3. The data-dependent issue of the propagation process in the graphics processing unit (GPU)-based CCL algorithm.  ... 
doi:10.3390/su8100916 fatcat:xxrdaewqljeqfbwogrftqfatey

Performance Models For Master/Slave Parallel Programs

Lucas Baldo, Leonardo Brenner, Luiz Gustavo Fernandes, Paulo Fernandes, Afonso Sales
2005 Electronic Notes in Theoretical Computer Science  
Although the SAN models may help the pre-analysis of implementations, the main contribution of this paper is to point out advantages and problems of the proposed modeling technique.  ...  This paper proposes the use of Stochastic Automata Networks (SAN) to develop models that can be efficiently applied to a large class of parallel implementations: master/slave (m/s) programs.  ...  Section 5 shows the SAN models describing the possible implementations of the Propagation Algorithm. Section 6 presents the analysis of performance indices for the presented models.  ... 
doi:10.1016/j.entcs.2005.01.015 fatcat:f52m36ce3vb45er3kwp6l4sqb4

Actor-Based Parallel Dataflow Analysis [chapter]

Jonathan Rodriguez, Ondřej Lhoták
2011 Lecture Notes in Computer Science  
We implement the algorithm in Scala, and evaluate its performance against a comparable sequential algorithm.  ...  We conclude that Actors are an effective way to parallelize this type of algorithm.  ...  Acknowledgements This work was supported, in part, by the Natural Sciences and Engineering Research Council of Canada.  ... 
doi:10.1007/978-3-642-19861-8_11 fatcat:e66r5dgaszg5tpbhnzo7hc3rym

Belief Propagation by Message Passing in Junction Trees: Computing Each Message Faster Using GPU Parallelization [article]

Lu Zheng, Ole Mengshoel, Jike Chong
2012 arXiv   pre-print
Experimentally, we study how junction tree parameters affect parallelization opportunities and hence the performance of our algorithm.  ...  We develop data structures and algorithms that extend existing junction tree techniques, and specifically develop a novel approach to computing each belief propagation message in parallel.  ...  We develop a parallel message computation algorithm for junction tree belief propagation. The speedup of this parallel algorithm, relative to the sequential algorithm, is analyzed theoretically.  ... 
arXiv:1202.3777v1 fatcat:xv5yea6ozzdbpfhseqry6xrutm

Parallel Algorithm with Modulus Structure for Simulation of Seismic Wave Propagation in 3D Multiscale Multiphysics Media [chapter]

Victor Kostin, Vadim Lisitsa, Galina Reshetova, Vladimir Tcheverda
2017 Lecture Notes in Computer Science  
Gorlatch. Properties of the conservative parallel discrete event simulation algorithm (Liliia Ziganurova, Lev Shchur). Lunch. Meeting of the Program Committee: papers selection for the special issue of an Int  ...  of graph models to the parallel algorithms design for the motion simulation of tethered satellite systems (Alexandr Kovartsev, Victor Zhidchenko). 16:25-16:45: Parallel algorithm for solving constrained  ...  The 14th International Conference on Parallel Computing Technologies (PaCT-2017). The Best Paper Award is sponsored by Springer.  ... 
doi:10.1007/978-3-319-62932-2_4 fatcat:3floyq6ajjeb5aqf5ehcgnu7qe

Adaptive Region Construction for Efficient Use of Radio Propagation Maps

Vinay B. Ramakrishnaiah, Suresh S. Muknahallipatna, Robert F. Kubichek
2017 Journal of Computer and Communications  
Simulations are performed with varying sizes of radio propagation maps, and the suitability of the ARC technique for real-time operation is presented.  ...  Next, the process of implementing the ARC technique for real-time execution on a GPU is presented.  ...  Profiling Analysis of Naive Version The NVIDIA Visual Profiler [20] is a cross-platform performance analysis tool that provides guidance for optimizing CUDA applications.  ... 
doi:10.4236/jcc.2017.58003 fatcat:zbnkirrzyjdnhabjv33woeto3y

Performance analysis of a parallel program for wave propagation simulation [chapter]

Michel Pahud, Frédéric Guidec, Thierry Cornu
1997 Lecture Notes in Computer Science  
The method is used for predicting the performances of ParFlow++, an irregular, parallel radio-wave propagation algorithm. 2 Wave propagation with ParFlow++  ...  This paper presents a method for achieving performance analysis for parallel irregular applications. The model is closely related to the Bulk Synchronous Programming (BSP) model [4].  ...  Conclusion The performance model for a parallel, irregular application presented is based on the BSP model [4], known for its ability to represent regular algorithms.  ... 
doi:10.1007/bfb0002848 fatcat:y5b5tdxwnjaala4djsr767xxne

Morph algorithms on GPUs

Rupesh Nasre, Martin Burtscher, Keshav Pingali
2013 Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '13  
We propose efficient techniques to perform concurrent subgraph addition, subgraph deletion, conflict detection and several optimizations to improve the scalability of morph algorithms.  ...  iii) a compiler analysis called Points-to Analysis (PTA), and (iv) Boruvka's Minimum Spanning Tree algorithm (MST).  ...  We introduce Delaunay Mesh Refinement, Survey Propagation, Points-to Analysis and Boruvka's Minimum Spanning Tree algorithm and discuss the sources of parallelism in these algorithms in Sections 2 through  ... 
doi:10.1145/2442516.2442531 dblp:conf/ppopp/NasreBP13 fatcat:xage2i3klbcrfbxvjmlmo2ejum

Parallel scalability in speech recognition

Kisun You, Jike Chong, Youngmin Yi, Ekaterina Gonina, Christopher Hughes, Yen-Kuang Chen, Wonyong Sung, Kurt Keutzer
2009 IEEE Signal Processing Magazine  
The highest performing algorithm style varies with the implementation platform.  ...  Our implementation of the inference engine involves a parallel graph traversal through an irregular graph-based knowledge network with millions of states and arcs.  ...  The authors also thank NVIDIA for donating the hardware used. This  ... 
doi:10.1109/msp.2009.934124 fatcat:jfqfjdhpjbfz5ehnxefvlkjeaq

Morph algorithms on GPUs

Rupesh Nasre, Martin Burtscher, Keshav Pingali
2013 SIGPLAN notices  
We propose efficient techniques to perform concurrent subgraph addition, subgraph deletion, conflict detection and several optimizations to improve the scalability of morph algorithms.  ...  iii) a compiler analysis called Points-to Analysis (PTA), and (iv) Boruvka's Minimum Spanning Tree algorithm (MST).  ...  We introduce Delaunay Mesh Refinement, Survey Propagation, Points-to Analysis and Boruvka's Minimum Spanning Tree algorithm and discuss the sources of parallelism in these algorithms in Sections 2 through  ... 
doi:10.1145/2517327.2442531 fatcat:nnwk3had3jertioz55faopmy4a

Informed Dynamic Scheduling for Belief-Propagation Decoding of LDPC Codes [article]

Andres I. Vila Casado, Miguel Griot, Richard D. Wesel
2007 arXiv   pre-print
Therefore, it also outperforms traditional scheduling for a large number of iterations. Complexity and implementability issues are also addressed.  ...  Low-Density Parity-Check (LDPC) codes are usually decoded by running an iterative belief-propagation, or message-passing, algorithm over the factor graph of the code.  ...  Also, an analysis of the hardware issues that may arise in a parallel implementation of these informed sequential scheduling strategies is presented. This paper is organized as follows.  ... 
arXiv:cs/0702111v2 fatcat:qtkqfaxlrfhntbz7cgmoa3lv3q