Static Detection of Access Anomalies in Ada95 [chapter]

Bernd Burgstaller, Johann Blieberger, Robert Mittermayr
2006 Lecture Notes in Computer Science  
In this paper we present data flow frameworks that are able to detect access anomalies in Ada multi-tasking programs.  ...  In particular, our approach finds all possible non-sequential accesses to shared non-protected variables. The algorithms employed are very efficient.  ...  In [11] the dynamic approach is further explored, nested parallel loops are considered, and experimental results are given. The retrospective in [24] gives a good survey of on-the-fly techniques.  ... 
doi:10.1007/11767077_4 fatcat:7suckrssdjblnlxsymjv5tmbsq

An empirical comparison of monitoring algorithms for access anomaly detection

A. Dinning, E. Schonberg
1990 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming - PPOPP '90  
(ii) Task recycling is more efficient in terms of space requirements and often in performance. (iii) The general approach of monitoring to detect access anomalies is practical.  ...  The program segment in Figure 1 illustrates an access anomaly. The doall construct creates two parallel threads that both write the variable X.  ...  Nevertheless, we do not believe the cost is too prohibitive for on-the-fly anomaly detection to be a useful debugging tool.  ... 
doi:10.1145/99163.99165 dblp:conf/ppopp/DinningS90 fatcat:kdcjymaun5at5evxbfpjvbffmi

On-the-fly detection of data races for programs with nested fork-join parallelism

John Mellor-Crummey
1991 Proceedings of the 1991 ACM/IEEE conference on Supercomputing - Supercomputing '91  
The worst-case space required by our protocol when monitoring an execution of a program P is O(VN), where V is the number of shared variables in P, and N is the maximum dynamic nesting of parallel constructs  ...  This paper presents a new protocol for run-time detection of data races in executions of shared-memory programs with nested fork-join parallelism and no other interthread synchronization.  ...  Robert Hood and Seema Hiranandani participated in early discussions of these ideas. Robert Hood implemented the prototype dependence-based instrumentation system.  ... 
doi:10.1145/125826.125861 dblp:conf/sc/Mellor-Crummey91 fatcat:aiyicrbrs5e3dernwtbsnj6u6u

Run-time parallelization: Its time has come

Lawrence Rauchwerger
1998 Parallel Computing  
The technique of speculatively parallelizing doall loops is presented in more detail.  ...  A survey of the various approaches to parallelizing partially parallel loops and fully parallel loops is presented.  ...  Generally, access anomaly detection techniques seek to identify the point in the parallel execution at which the access anomaly occurred. Padua et al.  ... 
doi:10.1016/s0167-8191(98)00024-6 fatcat:tovbe2cdbfhdjd4bdzxw2lzp6u

The LRPD test: speculative run-time parallelization of loops with privatization and reduction parallelization

L. Rauchwerger, D.A. Padua
1999 IEEE Transactions on Parallel and Distributed Systems  
Since, from our experience, a significant amount of the available parallelism in Fortran programs can be exploited by loops transformed through privatization and reduction parallelization, our methods  ...  Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns.  ...  A preliminary version of this paper appeared in Proceedings of the SIGPLAN 1995 Conference on Programming Language Design and Implementation, pages 218-232, June 1995.  ... 
doi:10.1109/71.752782 fatcat:wsjtf7kievftjdzsgmvckq7y2m

Large-Scale Directed Model Checking LTL [chapter]

Stefan Edelkamp, Shahid Jabbar
2006 Lecture Notes in Computer Science  
In this paper we propose an external, distributed and directed on-the-fly model checking algorithm to check general LTL properties in the model checker SPIN.  ...  Hoare, Prentice-Hall 1994, which also introduces to the idea of external model checking for the FDR system. Unfortunately, we haven't been able to access the reference.  ...  The main challenge for distributed and external on-the-fly model checking is that the depth-first traversal of the global state space graph as used in Nested-DFS (an on-the-fly variant of Tarjan's algorithm  ... 
doi:10.1007/11691617_1 fatcat:muf2lllww5dsnmw7t2go2l2ayi

A Performance Study of a Dual Xeon-Phi Cluster for the Forward Modelling of Gravitational Fields

Maricela Arroyo, Carlos Couder-Castañeda, Alfredo Trujillo-Alcantara, Israel-Enrique Herrera-Diaz, Nain Vera-Chavez
2015 Scientific Programming  
This research shows an efficient strategy based on nested parallelism using OpenMP, a design that in its outer structure acts as a controller of interconnected Xeon-Phi coprocessors while its interior  ...  is used for parallelizing the loops.  ...  Alfredo Trujillo-Alcantara wants to thank the support, in order to publish the results, provided by the program SENER-CONACYT and the hospitality of the doctoral program of the ESFM-IPN (http://esfm.ipn.mx  ... 
doi:10.1155/2015/316012 fatcat:iluaqecwungstejsx77apz3fge

Memory Trace Compression and Replay for SPMD Systems using Extended PRSDs

Sandeep Budanur, Frank Mueller, Todd Gamblin
2011 Performance Evaluation Review  
Concurrency levels in large-scale supercomputers are rising exponentially, and shared-memory nodes with hundreds of cores and non-uniform memory access latencies are expected within the next decade.  ...  We further introduce a replay mechanism for ScalaMemTrace traces, and discuss the results of our prototype implementation on the x86-64 architecture.  ...  The anomalies during function boundary detection are rare and do not adversely affect the compression process. We detail the speedup of this approach in the results section.  ... 
doi:10.1145/1964218.1964224 fatcat:eh5jsc6c4jdejaonxombj5om3i

Memory Trace Compression and Replay for SPMD Systems Using Extended PRSDs

S. Budanur, F. Mueller, T. Gamblin
2011 Computer journal  
Concurrency levels in large-scale supercomputers are rising exponentially, and shared-memory nodes with hundreds of cores and non-uniform memory access latencies are expected within the next decade.  ...  We further introduce a replay mechanism for ScalaMemTrace traces, and discuss the results of our prototype implementation on the x86-64 architecture.  ...  The anomalies during function boundary detection are rare and do not adversely affect the compression process. We detail the speedup of this approach in the results section.  ... 
doi:10.1093/comjnl/bxr071 fatcat:lx5lge73nfgpfl45bpkb7jhffu

Massively parallel regularized 3D inversion of potential fields on CPUs and GPUs

Martin Čuma, Michael S. Zhdanov
2014 Computers & Geosciences  
In this paper we explain the features that made this massive parallelization feasible and extend the code to add GPU support in the form of the OpenACC directives.  ...  This implementation resulted in up to a 22x speedup as compared to the scalar multithreaded implementation on a 12 core Intel CPU based computer node.  ...  Acknowledgments We acknowledge support of the University of Utah's Center for High Performance Computing (CHPC), the Consortium for Electromagnetic Modeling and Inversion (CEMI), and TechnoImaging.  ... 
doi:10.1016/j.cageo.2013.10.004 fatcat:tb6szqwh6jbevg7dc62byvhrlm

ScalaBenchGen: Auto-Generation of Communication Benchmarks Traces

Xing Wu, Vivek Deshpande, Frank Mueller
2012 2012 IEEE 26th International Parallel and Distributed Processing Symposium  
Experimental results demonstrate that generated source code of benchmarks preserves both the communication patterns and the wallclock time behavior of the original application.  ...  As a result, benchmarks tend to lag behind the development of complex scientific codes. This work contributes an automated approach to the creation of communication benchmarks.  ...  N −1 in each per-node trace. ScalaTrace then detects the loop structure and outputs a single PRSD to denote a single loop of 100 iterations.  ... 
doi:10.1109/ipdps.2012.114 dblp:conf/ipps/WuDM12 fatcat:tybwvddkkfesjeg7zran4awziq

Bolt

Michael Kling, Sasa Misailovic, Michael Carbin, Martin Rinard
2012 SIGPLAN notices  
Bolt can detect and escape from loops in off-the-shelf software, without available source code, and with no overhead in standard production use.  ...  Bolt operates on stripped x86 and x64 binaries, dynamically attaches and detaches to and from the program as needed, and dynamically detects loops and creates program state checkpoints to enable exploration  ...  Acknowledgements We would like to thank Deokhwan Kim, Stelios Sidiroglou, and the anonymous reviewers for their useful feedback and comments. We note our previous work on this topic [32].  ... 
doi:10.1145/2398857.2384648 fatcat:spx6wb6d3fdvvlused5glriz7e

Bolt

Michael Kling, Sasa Misailovic, Michael Carbin, Martin Rinard
2012 Proceedings of the ACM international conference on Object oriented programming systems languages and applications - OOPSLA '12  
Bolt can detect and escape from loops in off-the-shelf software, without available source code, and with no overhead in standard production use.  ...  Bolt operates on stripped x86 and x64 binaries, dynamically attaches and detaches to and from the program as needed, and dynamically detects loops and creates program state checkpoints to enable exploration  ...  Acknowledgements We would like to thank Deokhwan Kim, Stelios Sidiroglou, and the anonymous reviewers for their useful feedback and comments. We note our previous work on this topic [32].  ... 
doi:10.1145/2384616.2384648 dblp:conf/oopsla/KlingMCR12 fatcat:n363jey2rvagfnlfitd3yha7om

Scalable I/O tracing and analysis

Karthik Vijayakumar, Frank Mueller, Xiaosong Ma, Philip C. Roth
2009 Proceedings of the 4th Annual Workshop on Petascale Data Storage - PDSW '09  
Statistical information gathered reveals insight on the number of I/O and communication calls issued in the POP and FLASH I/O.  ...  Our contributions also include automated trace analysis to collect selected statistical information of I/O calls by parsing the compressed trace on-the-fly and time-accurate replay of communication events  ...  It was also sponsored in part by the Office of Advanced Scientific Computing Research; U.S. Department of Energy.  ... 
doi:10.1145/1713072.1713080 fatcat:bfzn3uzq4fbf3pjojedrzdsgbq

SI-TM

Heiner Litz, David Cheriton, Amin Firoozshahian, Omid Azizi, John P. Stevenson
2014 Proceedings of the 19th international conference on Architectural support for programming languages and operating systems - ASPLOS '14  
We show that snapshot isolation can reduce the number of aborts in some cases by three orders of magnitude and improve performance by up to 20x.  ...  In this paper, we investigate snapshot isolation transactional memory in which transactions operate on memory snapshots that always guarantee consistent reads.  ...  We are grateful to Timothy Harris, Michael Chan, Ricardo Dias, Tor Aamodt, Stephan Diestelhorst and the anonymous reviewers for their useful feedback on earlier versions of this manuscript.  ... 
doi:10.1145/2541940.2541952 dblp:conf/asplos/LitzCFAS14 fatcat:er7rsyd4f5cs5i3irx6ijek6bq