SyncChecker: Detecting Synchronization Errors between MPI Applications and Libraries

Zhezhe Chen, Xinyu Li, Jau-Yuan Chen, Hua Zhong, Feng Qin
2012 2012 IEEE 26th International Parallel and Distributed Processing Symposium  
After initiating nonblocking communication and performing overlapped computation, the MPI application reuses the message buffer before the MPI library completes the use of the same buffer, which may lead  ...  Then it checks whether the correct execution order between the MPI application and the MPI library is enforced by the MPI completion check routines.  ...  This work was also partially sponsored by the National Natural Science Foundation of China under Grant No. 61173004.  ... 
doi:10.1109/ipdps.2012.40 dblp:conf/ipps/ChenLCZQ12 fatcat:po45ukbaazd77fojg6cwnk6cbm

The MPI Bugs Initiative: a Framework for MPI Verification Tools Evaluation

Mathieu Laurent, Emmanuelle Saillard, Martin Quinson
2021 2021 IEEE/ACM 5th International Workshop on Software Correctness for HPC Applications (Correctness)  
In this paper, we present the MPI BUGS INITIATIVE, a complete collection of MPI codes to assess the status of MPI verification tools.  ...  We introduce a classification of MPI errors and provide correct and incorrect codes covering many MPI features and our categorization of errors.  ...  PARallel COntrol flow Anomaly CHecker (PARCOACH) combines a static analysis with a code instrumentation to detect misuse of MPI collectives [34] as well as nonblocking and persistent communications  ... 
doi:10.1109/correctness54621.2021.00008 fatcat:ogtf4tmuv5aclfv7txs5oaaqym

Correctness Analysis of MPI-3 Non-Blocking Communications in PARCOACH

Julien Jaeger, Emmanuelle Saillard, Patrick Carribault, Denis Barthou
2015 Proceedings of the 22nd European MPI Users' Group Meeting - EuroMPI '15
MPI-3 provides functions for non-blocking collectives.  ...  These enhancements focus on correct call sequences of all flavors of collective calls, and on the presence of completion calls for all nonblocking communications.  ...  These analyses have been implemented inside PARCOACH [3], a tool proposing a two-phase analysis to detect incorrect collective patterns in MPI programs.  ... 
doi:10.1145/2802658.2802674 dblp:conf/pvm/JaegerSCB15 fatcat:utwcbafa2bbebf54mlzyqoxtz4

MPI Runtime Error Detection with MUST: Advances in Deadlock Detection

Tobias Hilbrich, Joachim Protze, Martin Schulz, Bronis R. de Supinski, Matthias S. Müller
2013 Scientific Programming  
Finally, we present optimizations for the processing of MPI operations that reduce runtime deadlock detection overheads.  ...  We empirically observe that our improvements lead to sub-linear or better analysis time per operation for a wide range of real world applications.  ...  Some constructs in MPI can also lead to such cases, including MPI_Sendrecv, multithreaded MPI applications, and, in MPI-3, nonblocking collectives [12]. A.  ... 
doi:10.1155/2013/314971 fatcat:wogkn6qqrjevbpe5httfensz7i

MPI runtime error detection with MUST: Advances in deadlock detection

Tobias Hilbrich, Joachim Protze, Martin Schulz, Bronis R. de Supinski, Matthias S. Müller
2012 2012 International Conference for High Performance Computing, Networking, Storage and Analysis  
Finally, we present optimizations for the processing of MPI operations that reduce runtime deadlock detection overheads.  ...  Existing approaches could require O(p) analysis time per MPI operation, for p processes, where our improvements lead to an O(log p) complexity or better for real world applications.  ...  such as nonblocking collectives.  ... 
doi:10.1109/sc.2012.79 dblp:conf/sc/HilbrichPSSM12 fatcat:gkeoflbcarcork7jdtlbdvk3ey

Formal Analysis of Message Passing [chapter]

Stephen F. Siegel, Ganesh Gopalakrishnan
2011 Lecture Notes in Computer Science  
The MPI runtime system instantiates n processes from this code.  ...  MPI does provide for dynamic process creation, but this feature is not widely used, and we will assume n is fixed for the runtime of the program.  ...  However, this is the only source of nondeterminism arising from the use of collectives.) Nonblocking Operations.  ... 
doi:10.1007/978-3-642-18275-4_2 fatcat:vmr66hc24zgtjps5eo2n7qlk2a

PARCOACH Extension for Static MPI Nonblocking and Persistent Communication Validation

Van Man Nguyen, Emmanuelle Saillard, Julien Jaeger, Denis Barthou, Patrick Carribault
2020 2020 IEEE/ACM 4th International Workshop on Software Correctness for HPC Applications (Correctness)  
In this paper we present an extension of PARCOACH static analysis to detect misuse of MPI nonblocking and persistent communications.  ...  PARCOACH is a framework that detects MPI collective errors using a static/dynamic analysis.  ...  PARCOACH [1]-[3] is a framework built on top of LLVM to detect misuse of collectives in MPI programs.  ... 
doi:10.1109/correctness51934.2020.00009 fatcat:42j6pecrwjgvfmg4nokffkrpma

Maximizing Communication Overlap with Dynamic Program Analysis

Emmanuelle Saillard, Koushik Sen, Wim Lavrijsen, Costin Iancu
2018 Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region - HPC Asia 2018  
An offline analysis determines the program optimal points for maximal overlap when considering several programming constructs: nonblocking one-sided communication operations, nonblocking collectives and  ...  The value of our approach comes from: 1) the ability to optimize across boundaries of software modules or libraries, while specializing for the intrinsics of the underlying communication runtime; and 2  ...  Department of Energy, Office of Science, Advanced Scientific Computing Research under collaborative agreement numbers DE-SC0008699.  ... 
doi:10.1145/3149457.3149459 dblp:conf/hpcasia/SaillardSLI18 fatcat:2e74indkbnhlvol7nsu3z7t3ai

Performance analysis of parallel programs via message-passing graph traversal

M.J. Sottile, V.P. Chandu, D.A. Bader
2006 Proceedings 20th IEEE International Parallel & Distributed Processing Symposium  
This analysis provides a quantitative description of the sensitivity of applications to a variety of performance parameters to better understand the range of systems upon which an application can be expected  ...  We propose a methodology for analyzing the performance characteristics of parallel programs based on message-passing traces of their execution on a set of processors.  ...  Correctness Correctness of the graph and its modification during the analysis process is vital.  ... 
doi:10.1109/ipdps.2006.1639321 dblp:conf/ipps/SottileCB06 fatcat:rhcpmtqceveshnodjwynqaunea

Exploitation of Dynamic Communication Patterns through Static Analysis

Robert Preissl, Bronis R. de Supinski, Martin Schulz, Daniel J. Quinlan, Dieter Kranzlmüller, Thomas Panas
2010 2010 39th International Conference on Parallel Processing  
such as nonblocking collective operations.  ...  Our approach combines dynamic and static analysis techniques to identify common collective communication patterns expressed as point-to-point calls and transforms them into equivalent MPI collectives.  ...  STAR MPI [3] automatically finds an optimal communication topology for existing MPI collectives to match the characteristics of the application and the machine at runtime, while other projects [5] ,  ... 
doi:10.1109/icpp.2010.14 dblp:conf/icpp/PreisslSSQKP10 fatcat:5ngzy2npvbcmhfl3kt43gojxjq

Optimizing blocking and nonblocking reduction operations for multicore systems: Hierarchical design and implementation

Manjunath Gorentla Venkata, Pavel Shamis, Rahul Sampath, Richard L. Graham, Joshua S. Ladd
2013 2013 IEEE International Conference on Cluster Computing (CLUSTER)  
Many scientific simulations, using the Message Passing Interface (MPI) programming model, are sensitive to the performance and scalability of reduction collective operations such as MPI Allreduce and MPI  ...  configure the depth of hierarchy to match the system architecture, and 3) providing the ability to independently progress each of this hierarchy.  ...  ACKNOWLEDGMENT This research used resources of the Center for Computational Sciences at Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S.  ... 
doi:10.1109/cluster.2013.6702676 dblp:conf/cluster/VenkataSSGL13 fatcat:bvos4mi7zffyxhw74a2r4cqpsq

PARCOACH Extension for a Full-Interprocedural Collectives Verification

Pierre Huchant, Emmanuelle Saillard, Denis Barthou, Hugo Brunie, Patrick Carribault
2018 2018 IEEE/ACM 2nd International Workshop on Software Correctness for HPC Applications (Correctness)  
PARCOACH is independent from MPI implementation and handles all blocking and nonblocking collectives.  ...  PARCOACH only checks if the sequence of collectives is deterministic and supposes all MPI blocking and nonblocking collectives are called with compatible arguments.  ... 
doi:10.1109/correctness.2018.00013 dblp:conf/sc/HuchantSBBC18 fatcat:gaxoiaezsvegvmsiman3cj522u

Optimization of FASTEST-3D for Modern Multicore Systems [article]

Christoph Scheit, Georg Hager, Jan Treibig, Stefan Becker, Gerhard Wellein
2013 arXiv pre-print
FASTEST-3D is an MPI-parallel finite-volume flow solver based on block-structured meshes that has been developed at the University of Erlangen-Nuremberg since the early 1990s.  ...  Up to now its scalability was strongly limited by a rather rigid communication infrastructure, which led to a dominance of MPI time already at small process counts.  ...  time-line for a short time interval (c) Distribution of communication time on different MPI functions Fig. 3.  ... 
arXiv:1303.4538v1 fatcat:jjyci5ss3vb67cv7qopusq3y5i

Scalable communication protocols for dynamic sparse data exchange

Torsten Hoefler, Christian Siebert, Andrew Lumsdaine
2010 Proceedings of the 15th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '10  
Algorithm NBX improves the runtime of a sparse data exchange among 8,192 processors on BlueGene/P by a factor of 5.6.  ...  As a result, communication phases are typically expressed explicitly using point-to-point communication operations or collective operations.  ...  Rich Graham (ORNL), Terry Jones (ORNL), and Rajeev Thakur (ANL) helped us to get access to the necessary supercomputer resources and one of the authors was supported by the Department of Energy project  ... 
doi:10.1145/1693453.1693476 dblp:conf/ppopp/HoeflerSL10 fatcat:y4acnjusgfd7fad6lb7vypzua4

Scalable communication protocols for dynamic sparse data exchange

Torsten Hoefler, Christian Siebert, Andrew Lumsdaine
2010 SIGPLAN notices  
Algorithm NBX improves the runtime of a sparse data exchange among 8,192 processors on BlueGene/P by a factor of 5.6.  ...  As a result, communication phases are typically expressed explicitly using point-to-point communication operations or collective operations.  ...  Rich Graham (ORNL), Terry Jones (ORNL), and Rajeev Thakur (ANL) helped us to get access to the necessary supercomputer resources and one of the authors was supported by the Department of Energy project  ... 
doi:10.1145/1837853.1693476 fatcat:pkrs4kt7xvcdjpjgkifduktnc4