Static Detection of Access Anomalies in Ada95
[chapter]
2006
Lecture Notes in Computer Science
In this paper we present data flow frameworks that are able to detect access anomalies in Ada multi-tasking programs. ...
In particular, our approach finds all possible non-sequential accesses to shared non-protected variables. The algorithms employed are very efficient. ...
In [11] the dynamic approach is further explored, nested parallel loops are considered, and experimental results are given. The retrospective in [24] gives a good survey of on-the-fly techniques. ...
doi:10.1007/11767077_4
fatcat:7suckrssdjblnlxsymjv5tmbsq
An empirical comparison of monitoring algorithms for access anomaly detection
1990
Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming - PPOPP '90
(ii) Task recycling is more efficient in terms of space requirements and often in performance. (iii) The general approach of monitoring to detect access anomalies is practical. ...
The program segment in Figure 1 illustrates an access anomaly. The doall construct creates two parallel threads that both write the variable X. ...
Nevertheless, we do not believe the cost is too prohibitive for on-the-fly anomaly detection to be a useful debugging tool. ...
doi:10.1145/99163.99165
dblp:conf/ppopp/DinningS90
fatcat:kdcjymaun5at5evxbfpjvbffmi
On-the-fly detection of data races for programs with nested fork-join parallelism
1991
Proceedings of the 1991 ACM/IEEE conference on Supercomputing - Supercomputing '91
The worst-case space required by our protocol when monitoring an execution of a program P is O(VN), where V is the number of shared variables in P, and N is the maximum dynamic nesting of parallel constructs ...
This paper presents a new protocol for run-time detection of data races in executions of shared-memory programs with nested fork-join parallelism and no other interthread synchronization. ...
Robert Hood and Seema Hiranandani participated in early discussions of these ideas. Robert Hood implemented the prototype dependence-based instrumentation system. ...
doi:10.1145/125826.125861
dblp:conf/sc/Mellor-Crummey91
fatcat:aiyicrbrs5e3dernwtbsnj6u6u
Run-time parallelization: Its time has come
1998
Parallel Computing
The technique of speculatively parallelizing doall loops is presented in more detail. ...
A survey of the various approaches to parallelizing partially parallel loops and fully parallel loops is presented. ...
Generally, access anomaly detection techniques seek to identify the point in the parallel execution at which the access anomaly occurred. Padua et al. ...
doi:10.1016/s0167-8191(98)00024-6
fatcat:tovbe2cdbfhdjd4bdzxw2lzp6u
The LRPD test: speculative run-time parallelization of loops with privatization and reduction parallelization
1999
IEEE Transactions on Parallel and Distributed Systems
Since, from our experience, a significant amount of the available parallelism in Fortran programs can be exploited by loops transformed through privatization and reduction parallelization, our methods ...
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. ...
A preliminary version of this paper appeared in Proceedings of the SIGPLAN 1995 Conference on Programming Language Design and Implementation, pages 218-232, June 1995. ...
doi:10.1109/71.752782
fatcat:wsjtf7kievftjdzsgmvckq7y2m
Large-Scale Directed Model Checking LTL
[chapter]
2006
Lecture Notes in Computer Science
In this paper we propose an external, distributed and directed on-the-fly model checking algorithm to check general LTL properties in the model checker SPIN. ...
Hoare, Prentice-Hall 1994, which also introduces to the idea of external model checking for the FDR system. Unfortunately, we haven't been able to access the reference. ...
The main challenge for distributed and external on-the-fly model checking is that the depth-first traversal of the global state space graph as used in Nested-DFS (an on-the-fly variant of Tarjan's algorithm ...
doi:10.1007/11691617_1
fatcat:muf2lllww5dsnmw7t2go2l2ayi
A Performance Study of a Dual Xeon-Phi Cluster for the Forward Modelling of Gravitational Fields
2015
Scientific Programming
This research shows an efficient strategy based on nested parallelism using OpenMP, a design that in its outer structure acts as a controller of interconnected Xeon-Phi coprocessors while its interior ...
is used for parallelizing the loops. ...
Alfredo Trujillo-Alcantara wants to thank the support, in order to publish the results, provided by the program SENER-CONACYT and the hospitality of the doctoral program of the ESFM-IPN (http://esfm.ipn.mx ...
doi:10.1155/2015/316012
fatcat:iluaqecwungstejsx77apz3fge
Memory Trace Compression and Replay for SPMD Systems using Extended PRSDs
2011
Performance Evaluation Review
Concurrency levels in large-scale supercomputers are rising exponentially, and shared-memory nodes with hundreds of cores and non-uniform memory access latencies are expected within the next decade. ...
We further introduce a replay mechanism for ScalaMemTrace traces, and discuss the results of our prototype implementation on the x86_64 architecture. ...
The anomalies during function boundary detection are rare and do not adversely affect the compression process. We detail the speedup of this approach in the results section. ...
doi:10.1145/1964218.1964224
fatcat:eh5jsc6c4jdejaonxombj5om3i
Memory Trace Compression and Replay for SPMD Systems Using Extended PRSDs
2011
Computer journal
Concurrency levels in large-scale supercomputers are rising exponentially, and shared-memory nodes with hundreds of cores and non-uniform memory access latencies are expected within the next decade. ...
We further introduce a replay mechanism for ScalaMemTrace traces, and discuss the results of our prototype implementation on the x86_64 architecture. ...
The anomalies during function boundary detection are rare and do not adversely affect the compression process. We detail the speedup of this approach in the results section. ...
doi:10.1093/comjnl/bxr071
fatcat:lx5lge73nfgpfl45bpkb7jhffu
Massively parallel regularized 3D inversion of potential fields on CPUs and GPUs
2014
Computers & Geosciences
In this paper we explain the features that made this massive parallelization feasible and extend the code to add GPU support in the form of the OpenACC directives. ...
This implementation resulted in up to a 22x speedup as compared to the scalar multithreaded implementation on a 12 core Intel CPU based computer node. ...
Acknowledgments We acknowledge support of the University of Utah's Center for High Performance Computing (CHPC), the Consortium for Electromagnetic Modeling and Inversion (CEMI), and TechnoImaging. ...
doi:10.1016/j.cageo.2013.10.004
fatcat:tb6szqwh6jbevg7dc62byvhrlm
ScalaBenchGen: Auto-Generation of Communication Benchmarks Traces
2012
2012 IEEE 26th International Parallel and Distributed Processing Symposium
Experimental results demonstrate that generated source code of benchmarks preserves both the communication patterns and the wall-clock time behavior of the original application. ...
As a result, benchmarks tend to lag behind the development of complex scientific codes. This work contributes an automated approach to the creation of communication benchmarks. ...
N −1 in each per-node trace. ScalaTrace then detects the loop structure and outputs a single PRSD to denote a single loop of 100 iterations. ...
doi:10.1109/ipdps.2012.114
dblp:conf/ipps/WuDM12
fatcat:tybwvddkkfesjeg7zran4awziq
Bolt
2012
SIGPLAN notices
Bolt can detect and escape from loops in off-the-shelf software, without available source code, and with no overhead in standard production use. ...
Bolt operates on stripped x86 and x64 binaries, dynamically attaches and detaches to and from the program as needed, and dynamically detects loops and creates program state checkpoints to enable exploration ...
Acknowledgements We would like to thank Deokhwan Kim, Stelios Sidiroglou, and the anonymous reviewers for their useful feedback and comments. We note our previous work on this topic [32]. ...
doi:10.1145/2398857.2384648
fatcat:spx6wb6d3fdvvlused5glriz7e
Bolt can detect and escape from loops in off-the-shelf software, without available source code, and with no overhead in standard production use. ...
Bolt operates on stripped x86 and x64 binaries, dynamically attaches and detaches to and from the program as needed, and dynamically detects loops and creates program state checkpoints to enable exploration ...
Acknowledgements We would like to thank Deokhwan Kim, Stelios Sidiroglou, and the anonymous reviewers for their useful feedback and comments. We note our previous work on this topic [32]. ...
doi:10.1145/2384616.2384648
dblp:conf/oopsla/KlingMCR12
fatcat:n363jey2rvagfnlfitd3yha7om
Scalable I/O tracing and analysis
2009
Proceedings of the 4th Annual Workshop on Petascale Data Storage - PDSW '09
Statistical information gathered reveals insight on the number of I/O and communication calls issued in the POP and FLASH I/O. ...
Our contributions also include automated trace analysis to collect selected statistical information of I/O calls by parsing the compressed trace on-the-fly and time-accurate replay of communication events ...
It was also sponsored in part by the Office of Advanced Scientific Computing Research; U.S. Department of Energy. ...
doi:10.1145/1713072.1713080
fatcat:bfzn3uzq4fbf3pjojedrzdsgbq
We show that snapshot isolation can reduce the number of aborts in some cases by three orders of magnitude and improve performance by up to 20x. ...
In this paper, we investigate snapshot isolation transactional memory in which transactions operate on memory snapshots that always guarantee consistent reads. ...
We are grateful to Timothy Harris, Michael Chan, Ricardo Dias, Tor Aamodt, Stephan Diestelhorst and the anonymous reviewers for their useful feedback on earlier versions of this manuscript. ...
doi:10.1145/2541940.2541952
dblp:conf/asplos/LitzCFAS14
fatcat:er7rsyd4f5cs5i3irx6ijek6bq