MIL: A language to build program analysis tools through static binary instrumentation

Andres S. Charif-Rubial, Denis Barthou, Cedric Valensi, Sameer Shende, Allen Malony, William Jalby
2013 20th Annual International Conference on High Performance Computing  
The key feature of MIL is to ease the integration of static, global program analysis with instrumentation.  ...  In this paper, we propose a language, MIL, for the development of program analysis tools based on static binary instrumentation.  ...  not require debug symbols and can be used as a fallback).  ... 
doi:10.1109/hipc.2013.6799106 dblp:conf/hipc/RubialBVSMJ13 fatcat:5fydxgw5oraxlabc6j2vs4ujh4

A Survey of Performance Analysis Tools for OpenMP and MPI

J. Sairabanu, M. Rajasekhara Babu, Arunava Kar, Aritra Basu
2016 Indian Journal of Science and Technology  
These are widely used to provide different performance characteristics of parallelism in different test cases.  ...  program.  ...  OpenMP OpenMP serves as a standard for shared-memory parallel programming.  ... 
doi:10.17485/ijst/2016/v9i43/91712 fatcat:siezkdrcgbf2zixetkwkui7e3e

PACHA: Low Cost Bare Metal Development for Shared Memory Manycore Accelerators

Alexandre Aminot, Alexandre Guerre, Julien Peeters, Yves Lhuillier
2013 Procedia Computer Science  
A case study on a TILEPro64 is presented: the performance gain using PACHA rather than Linux with OpenMP or Pthread is about 1,8x to 4x, without increasing the development cost.  ...  With a x86 support and a Linux compatibility, PACHA offers a functional simulator and all the Linux set of debugging tools.  ...  Note that using generic types forces the user to manipulate these generic data only with the functions provided by the programming interface.  ... 
doi:10.1016/j.procs.2013.05.332 fatcat:nycqbhmfjjelddfigpkqnwmnzi

Automated Bug Detection for High-level Synthesis of Multi-threaded Irregular Applications

Pietro Fezzardi, Fabrizio Ferrandi
2020 ACM Transactions on Parallel Computing  
The proposed debug flow was integrated in an existing open-source HLS framework and evaluated on a set of benchmarks using OpenMP parallelization directives.  ...  High-Level Synthesis (HLS) of multi-threaded parallel programs is increasingly used to extract parallelism.  ...  Second, because it is representative of the real current use of the OpenMP programming model on FPGA.  ... 
doi:10.1145/3418086 fatcat:ixgetytmtjai7env263rt2kmy4

Software Development for Parallel and Multi-Core Processing [chapter]

Kenn R.
2012 Embedded Systems - High Performance Systems, Applications and Projects  
Many users and vendors who initially used HPF have migrated to OpenMP Fortran which is an extension to Fortran 95.  ...  The tracing tools can track thread migration between cores, scheduling events, and other information useful for maximizing core utilization.  An SMP approach is best for a larger number of cores and for  ... 
doi:10.5772/38261 fatcat:3zwgizqp3vgqfa2hn6gyxmuw44

Hands-on Practical Hybrid Parallel Application Performance Engineering [chapter]

Markus Geimer, Michael Gerndt, Sameer Shende, Bert Wesarg, Brian Wylie
2012 Lecture Notes in Computer Science  
TAU_TRACK_SIGNALS 0 Setting to 1 generate debugging callstack info when a program crashes TAU_COMM_MATRIX 0 Setting to 1 generates communication matrix display using context events TAU_THROTTLE  ...  = -fopenmp # GCC compiler OPENMP = -openmp #Intel compiler ... #---------------------------------------------------------------------# The Fortran compiler used for MPI programs #---------------------  ... 
doi:10.1007/978-3-642-33518-1_6 fatcat:6vtbimvcmzddnmly3nuxtrth6e

An LLVM Instrumentation Plug-in for Score-P

Ronny Tschüter, Johannes Ziegenbalg, Bert Wesarg, Matthias Weber, Christian Herold, Sebastian Döbel, Ronny Brendel
2017 Proceedings of the Fourth Workshop on the LLVM Compiler Infrastructure in HPC - LLVM-HPC'17  
ACKNOWLEDGMENTS This research used resources of the Oak Ridge Leadership Computing Facility at Oak Ridge National Laboratory, which is supported by the Office of Science of the Department of Energy under  ...  The code consists of an OpenMP and a MPI+OpenMP version. In our experiments we use the OpenMP version of miniFE.  ...  The experiments use the parallelization paradigms Message Passing Interface (MPI) [9] and OpenMP [4] to distribute the workload over multiple Comparison of event sequences.  ... 
doi:10.1145/3148173.3148187 dblp:conf/sc/TschuterZWWHDB17 fatcat:sifwgfm7kve37m4y66dh7tftcu

Source-Code-Correlated Cache Coherence Characterization of OpenMP Benchmarks

Jaydeep Marathe, Frank Mueller
2007 IEEE Transactions on Parallel and Distributed Systems  
We provide tool support to extract these reference traces and synchronization information from OpenMP threads at run-time using dynamic binary rewriting of the application executable.  ...  In many cases, aggregate events provide insufficient information for programmers to understand and optimize the coherence behavior of their applications.  ...  As the program executes, the handler functions get invoked, generating an event trace (memory accesses, function entry/exits and OpenMP synchronization 1 The debug information embedded in the executable  ... 
doi:10.1109/tpds.2007.1058 fatcat:5fhv5hflfjamlfn4rzrtvobrmq

A Survey: Runtime Software Systems for High Performance Computing

2017 Supercomputing Frontiers and Innovations  
Conventional practices employ message-passing programming interfaces; sometimes combining thread-based shared memory interfaces such as OpenMP.  ...  This is useful when the variable is manipulated within the meta-data of a structure such a graph but the actual value is not used.  ...  This is not absolutely the case as some modest amount of runtime control has been employed even for the widely used programming interfaces of both OpenMP and MPI.  ... 
doi:10.14529/jsfi170103 fatcat:yqj65kpvhngovcmgrr46vwwr6i

Argobots: A Lightweight Low-Level Threading and Tasking Framework

Sangmin Seo, Abdelhalim Amer, Pavan Balaji, Cyril Bordage, George Bosilca, Alex Brooks, Philip Carns, Adrian Castello, Damien Genet, Thomas Herault, Shintaro Iwasaki, Prateek Jindal (+9 others)
2018 IEEE Transactions on Parallel and Distributed Systems  
In this paper, we present Argobots, a lightweight, low-level threading and tasking framework that is designed as a portable and performant substrate for high-level programming models or runtime systems  ...  Argobots offers a carefully designed execution model that balances generality of functionality with providing a rich set of controls to allow specialization by end users or high-level programming models  ...  The key challenge in this programming model use case is that it must balance three competing goals: programmability (i.e., ensuring that the service itself is easy to debug and maintain), performance for  ... 
doi:10.1109/tpds.2017.2766062 fatcat:6vusko35zvhvvpykenfva4x35a

Chapter 5 An Environment for Conducting Families of Software Engineering Experiments [chapter]

Lorin Hochstein, Taiga Nakamura, Forrest Shull, Nico Zazworka, Victor R. Basili, Marvin V. Zelkowitz
2008 Advances in Computers  
We have successfully used this environment to study the impact of parallel programming languages in the high-performance computing domain on programmer productivity at multiple universities across the  ...  We have built such a tool that allows us to use various algorithms to manipulate these heuristics.  ...  The duration data associated with each activity also helps us identify interesting events during the development: For example, when a large amount of time is spent debugging, analysts can focus on events  ... 
doi:10.1016/s0065-2458(08)00605-0 fatcat:35bzaqx7pzgzxnodo3wnx623ry

The OpenTM Transactional Application Programming Interface

Woongki Baek, Chi Cao Minh, Martin Trautmann, Christos Kozyrakis, Kunle Olukotun
2007 Parallel Architecture and Compilation Techniques (PACT), Proceedings of the International Conference on  
OpenTM extends OpenMP, a widely used API for shared-memory parallel programming, with a set of compiler directives to express non-blocking synchronization and speculative parallelization based on memory  ...  Overall, OpenTM provides a practical and efficient TM programming environment within the familiar scope of OpenMP.  ...  To date, TM programming has primarily been based on libraries that include special functions to define transaction boundaries, manipulate shared data, and control the runtime system.  ... 
doi:10.1109/pact.2007.4336227 fatcat:nn7gbfngvrff5egt4jzfgurptm

Contech

Brian P. Railing, Eric R. Hein, Thomas M. Conte
2015 ACM Transactions on Architecture and Code Optimization (TACO)  
to understand and exploit program aspects.  ...  Contech: Efficiently generating dynamic task graphs for arbitrary parallel programs. ACM Trans.  ...  First, the program uses a combination of pthreads, OpenMP, or MPI to implement its parallelism. Second, the program is written in C, C++, or Fortran.  ... 
doi:10.1145/2776893 fatcat:mxvw4k5myrgxlocupaxuyzxsku

Embla - Data Dependence Profiling for Parallel Programming

Karl-Filip Faxén, Konstantin Popov, Sverker Jansson, Lars Albertsson
2008 2008 International Conference on Complex, Intelligent and Software Intensive Systems  
In this paper we present a novel tool for aiding programmers in parallelizing programs.  ...  In contrast to static analysis tools, which by necessity make conservative approximation, Embla is able to find more parallelism in sequential programs, and relies on the programmer to transform the program  ...  to the source level using debugging information in the standard way.  ... 
doi:10.1109/cisis.2008.52 dblp:conf/cisis/FaxenPAJ08 fatcat:mwl3wkhy65gmhjcq3pqu5xweu4

Performance Tuning of x86 OpenMP Codes with MAQAO [chapter]

Denis Barthou, Andres Charif Rubial, William Jalby, Souad Koliai, Cédric Valensi
2010 Tools for High Performance Computing 2009  
This paper presents a tool for the performance analysis of multithreaded codes (OpenMP programs support at the moment).  ...  We show on some examples how this can help users improve their OpenMP applications.  ...  We first detail the compression algorithm used in MAQAO and then describe how this method has been adapted to MAQAO for tracing multithreaded codes (OpenMP programs support at the moment).  ... 
doi:10.1007/978-3-642-11261-4_7 dblp:conf/ptw/BarthouRJKV09 fatcat:6lehttx755acvmpyfpiybfnldu