Tools for application-oriented performance tuning

John Mellor-Crummey, Robert Fowler, David Whalley
2001 Proceedings of the 15th international conference on Supercomputing - ICS '01  
We discuss some of the critical utility and usability issues for application-level performance analysis tools in the context of two performance tools, MHSim and HPCView, that we built to support our own  ...  HPCView is a tool that combines data from arbitrary sets of instrumentation sources and correlates it with program source code.  ...  Acknowledgments Monika Mevencamp was the principal programmer for the HPCView tool.  ... 
doi:10.1145/377792.377826 dblp:conf/ics/Mellor-CrummeyFW01 fatcat:dz4yutxepzdzpagwothee7b7u4

ParaProf: A Portable, Extensible, and Scalable Tool for Parallel Performance Profile Analysis [chapter]

Robert Bell, Allen D. Malony, Sameer Shende
2003 Lecture Notes in Computer Science  
This paper presents the design, implementation, and application of ParaProf, a portable, extensible, and scalable tool for parallel performance profile analysis.  ...  ParaProf attempts to offer "best of breed" capabilities to performance analysts, those inherited from a rich history of single processor profilers and those being pioneered in parallel tools research.  ...  The HPCView tool [13] best exemplifies the integration of sequential analysis capabilities.  ... 
doi:10.1007/978-3-540-45209-6_7 fatcat:dviap565nfbyhaaim5qyrko77y

A Loop-Aware Search Strategy for Automated Performance Analysis [chapter]

Eli D. Collins, Barton P. Miller
2005 Lecture Notes in Computer Science  
Automated online search is a powerful technique for performance diagnosis.  ...  Such a search can change the types of experiments it performs while the program is running, making decisions based on live performance data.  ...  Unlike Paradyn, which performs online automated performance analysis, HPCView is a postmortem tool that combines the results of several program runs. The DPOMP tool [?]  ... 
doi:10.1007/11557654_68 fatcat:sijzuhh4kje27iwjy6qgmfrigu

A Tool Suite for Simulation Based Analysis of Memory Access Behavior [chapter]

Josef Weidendorfer, Markus Kowarschik, Carsten Trinitis
2004 Lecture Notes in Computer Science  
To get a general purpose, easy-to-use tool suite, the simulation approach allows us to take advantage of runtime instrumentation, i.e. no preparation of application code is needed, and enables for sophisticated  ...  In an ongoing project, research on advanced cache analysis is based on these tools.  ...  We would like to thank Julian Seward for his excellent runtime instrumentation framework, and Nick Nethercote for the cache simulator we based our profiling tool on.  ... 
doi:10.1007/978-3-540-24688-6_58 fatcat:pufwch46dncuvmzk6lmjwn7d64

Knowledge Support and Automation for Performance Analysis with PerfExplorer 2.0

Kevin A. Huck, Allen D. Malony, Sameer Shende, Alan Morris
2008 Scientific Programming  
The integration of scalable performance analysis in parallel development tools is difficult.  ...  In this paper, we will discuss the current version of PerfExplorer, a performance analysis framework which provides dimension reduction, clustering and correlation analysis of individual trails of large  ...  The authors would like to thank PERI, SciDAC, ORNL, NERSC and RENCI for including us in the PERI SciDAC project, and a special thanks to John Mellor-Crummey for a better understanding of the locality issues  ... 
doi:10.1155/2008/985194 fatcat:k2gynig3mnfafndtr7f5aav6ii

Energy Measurement Tools for Ultrascale Computing: A Survey

2015 Supercomputing Frontiers and Innovations  
Such data would enable researchers and designers to pinpoint energy inefficiencies at all levels of the computing stack, from whole nodes down to critical regions of code.  ...  With energy efficiency one of the main challenges on the way towards ultrascale systems, there is a great need for access to high-quality energy consumption data.  ...  Blanco, A. Cabrera for their analysis needs, as well as help guide future developments in hardware and software tools.  ... 
doi:10.14529/jsfi150204 fatcat:ldmrdmt2gngpblhm6ksm2ckwrq

Open | SpeedShop: An Open Source Infrastructure for Parallel Performance Analysis

Martin Schulz, Jim Galarowicz, Don Maghrak, William Hachfeld, David Montoya, Scott Cranford
2008 Scientific Programming  
Over the last decades a large number of performance tools has been developed to analyze and optimize high performance applications.  ...  Open | SpeedShop has two different faces: it provides an interoperable tool set covering the most common analysis steps as well as a comprehensive plugin infrastructure for building new tools.  ...  Acknowledgements Part of this work was performed under the auspices of the US Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 (UCRL-JRNL-234840).  ... 
doi:10.1155/2008/713705 fatcat:7gx7g57ygbagtdyq5wq22w466a

An algebra for cross-experiment performance analysis

F. Song, F. Wolf, N. Bhatia, J. Dongarra, S. Moore
2004 International Conference on Parallel Processing, 2004. ICPP 2004.  
Performance tuning of parallel applications usually involves multiple experiments to compare the effects of different optimization strategies.  ...  The algebra consists of a data model to represent the data in a platform-independent fashion plus arithmetic operations to merge, subtract, and average the data from different experiments.  ...  We are also grateful to Julien Langou for helping us run the PESCAN code and explaining to us the mathematics behind it.  ... 
doi:10.1109/icpp.2004.1327905 dblp:conf/icpp/SongWBDM04 fatcat:apyju2s2dbftzjtn7jjrilccae

Finding and Removing Performance Bottlenecks in Large Systems [chapter]

Glenn Ammons, Jong-Deok Choi, Manish Gupta, Nikhil Swamy
2004 Lecture Notes in Computer Science  
In the approach, for each kind of profile (for example, calltree profiles), a tool developer implements a simple profile interface that exposes a small set of primitives for selecting summaries of profile  ...  Next, an analyst uses a search tool, which is written to the profile interface and thus independent of the kind of profile, to find bottlenecks.  ...  These summaries are good starting points for a top-down search.  ... 
doi:10.1007/978-3-540-24851-4_8 fatcat:lffd5r7fd5a6fchlmojn4pzzna

The Tau Parallel Performance System

Sameer S. Shende, Allen D. Malony
2006 The international journal of high performance computing applications  
This paper presents the TAU (Tuning and Analysis Utilities) parallel performance system and describes how it addresses diverse requirements for performance observation and analysis.  ...  The ability of performance technology to keep pace with the growing complexity of parallel and distributed systems depends on robust performance frameworks that can at once provide system-specific performance  ...  Acknowledgments Research at the University of Oregon is sponsored by contracts (DE-FG03-01ER25501 and DE-FG02-03ER25561) from the MICS program of the U.S. Dept. of Energy, Office of Science.  ... 
doi:10.1177/1094342006064482 fatcat:tu5rcme47bctdgbrsdq2hzahp4

Low-overhead call path profiling of unmodified, optimized code

Nathan Froyd, John Mellor-Crummey, Rob Fowler
2005 Proceedings of the 19th annual international conference on Supercomputing - ICS '05  
A comparison with instrumentation-based profilers, such as gprof, shows that for call-intensive programs, our sampling-based strategy for call path profiling has over an order of magnitude lower overhead  ...  We describe the design and implementation of a low-overhead call path profiler based on stack sampling.  ...  Acknowledgments Nathan Tallent wrote most of the original csprof code for the Intel Itanium architecture.  ... 
doi:10.1145/1088149.1088161 dblp:conf/ics/FroydMF05 fatcat:wikt6m3ngfc2tdmqdhspab6oci

Automating vertical profiling

Matthias Hauswirth, Amer Diwan, Peter F. Sweeney, Michael C. Mozer
2005 Proceedings of the 20th annual ACM SIGPLAN conference on Object oriented programming systems languages and applications - OOPSLA '05  
Last year at OOPSLA we presented a methodology, vertical profiling, for understanding the performance of object-oriented programs.  ...  Although we explore these activities in the context of vertical profiling, both activities are widely applicable in the performance analysis area.  ...  Any opinions, findings and conclusions or recommendations expressed in this material are the authors' and do not necessarily reflect those of the sponsors.  ... 
doi:10.1145/1094811.1094834 dblp:conf/oopsla/HauswirthDSM05 fatcat:obiafpsvvzgrhhdlvn7phuzkw4

Automating vertical profiling

Matthias Hauswirth, Amer Diwan, Peter F. Sweeney, Michael C. Mozer
2005 SIGPLAN notices  
Last year at OOPSLA we presented a methodology, vertical profiling, for understanding the performance of object-oriented programs.  ...  Although we explore these activities in the context of vertical profiling, both activities are widely applicable in the performance analysis area.  ...  Any opinions, findings and conclusions or recommendations expressed in this material are the authors' and do not necessarily reflect those of the sponsors.  ... 
doi:10.1145/1103845.1094834 fatcat:zqr2jpdwzbcnji2zgczflwxxwq

Vertical profiling

Matthias Hauswirth, Peter F. Sweeney, Amer Diwan, Michael Hind
2004 Proceedings of the 19th annual ACM SIGPLAN Conference on Object-oriented programming, systems, languages, and applications - OOPSLA '04  
We illustrate the efficacy of this approach by providing deep understandings of performance problems of Java applications run on a VM with vertical profiling support.  ...  Thus, understanding system performance of such a system requires profiling that spans all levels of the execution stack, such as the hardware, operating system, virtual machine, and application.  ...  ACKNOWLEDGEMENTS We thank Sam Guyer for the idea of using, and implementation of, object inlining to improve performance of db.  ... 
doi:10.1145/1028976.1028998 dblp:conf/oopsla/HauswirthSDH04 fatcat:k3lzdhsilbfcrk6pax45jdojae

PerfExplorer: A Performance Data Mining Framework For Large-Scale Parallel Computing

K.A. Huck, A.D. Malony
ACM/IEEE SC 2005 Conference (SC'05)  
Examples are given demonstrating these techniques for performance analysis of ASCI applications.  ...  In this paper, we present PerfExplorer, a framework for parallel performance data mining and knowledge discovery.  ...  Clustering Cluster analysis is a valuable tool for reducing large parallel profiles down to representative groups for investigation.  ... 
doi:10.1109/sc.2005.55 dblp:conf/sc/HuckM05 fatcat:x2edtxdfkjesvcmmdwcoeohn5e