Midpoint routing algorithms for Delaunay triangulations

Weisheng Si, Albert Y. Zomaya
2010 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)  
We consider single-and double-precision with numerous performance enhancements, including low-level tuning, numerical approximation, data structure transformations, OpenMP parallelization, and algorithmic  ...  In this case we see the applicability of the proposed method for debugging. Identifying Ad-hoc Synchronization for Enhanced Race Detection Ali Jannesari and Walter F.  ...  on the Cell Processor, the MTA-8, and the NVIDIA GeForce 200 series.  ... 
doi:10.1109/ipdps.2010.5470471 dblp:conf/ipps/SiZ10 fatcat:yuchdc4zp5borm5vs7j4rqgmzy

Discovery of Potential Parallelism in Sequential Programs

Zhen Li, Ali Jannesari, Felix Wolf
2013 2013 42nd International Conference on Parallel Processing  
Zia Ul Huda, a sincere friend who shared the same office with me for more than three years, made key contributions to DiscoPoP on parallel pattern detection.  ...  A program is represented as a CU graph, in which vertexes are CUs and edges are data dependences.  ...  The notion of CUs was inspired by our earlier work [88] , where a variation of this concept was applied to detect data races on correlated variables.  ... 
doi:10.1109/icpp.2013.119 dblp:conf/icpp/LiJW13 fatcat:6dc5s2ao4rhv7avxb4oai77hoi

Parallel Algorithm on GPU for Wireless Sensor Data Acquisition Using a Team of Unmanned Aerial Vehicles

Vincent Roberge, Mohammed Tarbouchi
2021 Sensors  
Concerned with the overall runtime of the framework, the SSSP algorithm is implemented in parallel on a graphics processing unit.  ...  The results also show the significant advantage of the parallel implementation on GPU.  ...  This step is parallelized in OpenMP on multi-core CPU using one thread per pairs. For the GA, OpenMP is also used to parallelize each step of the algorithm using one thread per candidate solution.  ... 
doi:10.3390/s21206851 pmid:34696064 fatcat:giejg3a77rh2pjedbriqowimqq

Multi-threaded ASP Solving with clasp [article]

Martin Gebser, Benjamin Kaufmann, Torsten Schaub
2012 arXiv   pre-print
Also, we provide some insights into the data representation used for different constraint types handled by clasp.  ...  We present the new multi-threaded version of the state-of-the-art answer set solver clasp.  ...  Acknowledgments We are grateful to Hannes Schröder for support with experiments and to the anonymous referees for their comments.  ... 
arXiv:1210.3265v1 fatcat:gmdhas4425bw3lzj4g6gadpgdm

Multi-threaded ASP solving with clasp

MARTIN GEBSER, BENJAMIN KAUFMANN, TORSTEN SCHAUB
2012 Theory and Practice of Logic Programming  
Also, we provide some insights into the data representation used for different constraint types handled by clasp.  ...  We present the new multi-threaded version of the state-of-the-art answer set solver clasp.  ...  Acknowledgements We are grateful to Hannes Schröder for support with experiments and to the anonymous referees for their comments.  ... 
doi:10.1017/s1471068412000166 fatcat:nx62qlibwrbpvlcctkldki7b7y

A Tree Clock Data Structure for Causal Orderings in Concurrent Executions [article]

Umang Mathur, Andreas Pavlogiannis, Hünkar Can Tunç, Mahesh Viswanathan
2022 arXiv   pre-print
Instead of analyzing all behaviors of a program, these techniques detect errors by focusing on a single program execution.  ...  These results illustrate that tree clocks have the potential to become a standard data structure with wide applications in concurrent analyses.  ...  Race Detection and Reachability in Nearly Series- namic Race Detection. In Proceedings of the 30th ACM SIGPLAN Conference on Parallel DAGs.  ... 
arXiv:2201.06325v1 fatcat:pa7dshhvevcvlmoag4pm3ye5ua

Towards Automatic Parallelization of Stream Processing Applications

Manuel F. Dolz, David Del Rio Astorga, Javier Fernandez, J. Daniel Garcia, Jesus Carretero
2018 IEEE Access  
This framework uses a novel pipeline stage-balancing technique which provides the code generator module with the necessary information to produce balanced pipelines.  ...  A comparison study under several thread-core oversubscribed conditions reveals that the framework can bring comparable performance results with respect to the Intel TBB programming framework.  ...  [11] , detect potential parallel patterns at compile-time but require subsequent executions to find out data races and dependencies in the resulting codes.  ... 
doi:10.1109/access.2018.2855064 fatcat:3f6ovqtbkvdjdgf5zb5a6dlzt4

A visual performance analysis framework for task-based parallel applications running on hybrid clusters

Vinícius Garcia Pinto, Lucas Mello Schnorr, Luka Stanisic, Arnaud Legrand, Samuel Thibault, Vincent Danjean
2018 Concurrency and Computation  
Finally, we thank Rémy Drouilhet who provided us with an efficient Rcpp implementation of trace aggregation at a fixed granularity and Renaud Blanch who pointed us to the book of Munzner on HCI performance  ...  Some experiments were carried out at the Grid'5000 platform (https://www.grid5000.fr), with support from Inria, CNRS, RENATER and several other organizations.  ...  A somehow related approach was recently proposed to analyze OpenMP applications with grain graphs (35) .  ... 
doi:10.1002/cpe.4472 fatcat:bffj4u364feqlaf2gtn6dzlaie

Modeling optimistic concurrency using quantitative dependence analysis

Christoph von Praun, Rajesh Bordawekar, Calin Cascaval
2008 Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming - PPoPP '08  
This work presents a quantitative approach to analyze parallelization opportunities in programs with irregular memory access where potential data dependences mask available parallelism.  ...  Based on the results of our analysis, we classify applications into three categories with low, medium, and high dependence densities.  ...  We would like to thank the STAMP team at Stanford for giving us early access to their benchmark suite.  ... 
doi:10.1145/1345206.1345234 dblp:conf/ppopp/PraunBC08 fatcat:zv47yrhmqbfzhlthcr5e7yztpi

Performance profilers and debugging tools for openMP applications

Nader Boushehrinejad Moradi
2021
We propose a novel OpenMP series-parallel graph (OSPG) that precisely captures the series-parallel relations between different fragments of the program's execution.  ...  An OpenMP program that achieves reasonable speedup on a low core count system may not achieve scalable speedup when run on a system with a larger number of cores.  ...  "On-the-fly Data Race Detection with the Enhanced OpenMP Series-Parallel Graph" [33] , which introduces the EOSPG, an extension of the OSPG, and describes a data race detection algorithm to detect apparent  ... 
doi:10.7282/t3-0edj-bf23 fatcat:5frti7p6bbevneqesgyoquq2eu

DARPA's HPCS Program: History, Models, Tools, Languages [chapter]

Jack Dongarra, Robert Graybill, William Harrod, Robert Lucas, Ewing Lusk, Piotr Luszczek, Janice Mcmahon, Allan Snavely, Jeffrey Vetter, Katherine Yelick, Sadaf Alam, Roy Campbell (+5 others)
2008 Advances in Computers  
The historical context surrounding the birth of the DARPA High  ...  by feeding the dynamic address stream on-the-fly to a set of cache simulators, unique to each machine.  ...  All three HPCS languages have parallel semantics; that is, there is no reliance on automatic parallelism, nor are the languages purely data parallel with serial semantics, like the core of HPF.  ... 
doi:10.1016/s0065-2458(08)00001-6 fatcat:ilchf26s2fgkzlnqplxt243u7m

FireWorks: a dynamic workflow system designed for high-throughput applications

Anubhav Jain, Shyue Ping Ong, Wei Chen, Bharat Medasani, Xiaohui Qu, Michael Kocher, Miriam Brafman, Guido Petretto, Gian-Marco Rignanese, Geoffroy Hautier, Daniel Gunter, Kristin A. Persson
2015 Concurrency and Computation  
, (iii) provenance and reporting for long-running projects, (iv) automated duplicate detection, and (v) dynamic workflows (i.e., modifying the workflow graph during runtime).  ...  It has been designed to serve the demanding high-throughput computing needs of these applications, with extensive support for (i) concurrent execution through job packing, (ii) failure detection and correction  ...  For example, one can execute data-parallel projects by defining Workflows containing the same FireTasks but with different input data in the spec.  ... 
doi:10.1002/cpe.3505 fatcat:ol3fgtdr6bhdvdsjiecjvnqkh4

PolyCheck: dynamic verification of iteration space transformations on affine programs

Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, P. Sadayappan
2016 Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages - POPL 2016  
transformations, with space consumption proportional to the original program data and robust to arbitrary datasets of a given size.  ...  High-level compiler transformations, especially loop transformations, are widely recognized as critical optimizations to restructure programs to improve data locality and expose parallelism.  ...  Acknowledgments We thank the anonymous referees for the feedback and many suggestions that helped us significantly in improving the presentation of the work.  ... 
doi:10.1145/2837614.2837656 dblp:conf/popl/BaoKPRS16 fatcat:dmeyobfderamrl6heu2vs6zu3i

Concurrent Computing in the Many-core Era (Dagstuhl Seminar 15021)

Michael Philippsen, Pascal Felber, Michael L. Scott, J. Eliot B. Moss, Marc Herbstritt
2015 Dagstuhl Reports  
The current seminar built on the previous seminars by notably (1) broadening the scope to concurrency beyond transactional memory and shared-memory multicores abstractions, (2) focusing on the new challenges  ...  This report documents the program and the outcomes of Dagstuhl Seminar 15021 "Concurrent computing in the many-core era".  ...  Strongly atomic in the absence of data races.  ... 
doi:10.4230/dagrep.5.1.1 dblp:journals/dagstuhl-reports/PhilippsenFSM15 fatcat:owcmta65hzb5vmglwq3dwzbehy

29th International Conference on Data Engineering [book of abstracts]

2013 2013 IEEE 29th International Conference on Data Engineering Workshops (ICDEW)  
In this paper, we propose a new framework of processing kGPM with on-the-fly ranked lists based on spanning trees of the cyclic graph query.  ...  IPA are also easily parallelized by simply adding a few lines of OpenMP meta-programming expressions.  ...  These volunteers welcome participants, give directions, help in the sessions and on the registration desk, and generally make sure the conference is running smoothly.  ... 
doi:10.1109/icdew.2013.6547409 fatcat:wadzpuh3b5htli4mgb4jreoika