8,528 Hits in 3.5 sec

Tracing application program execution on the CRAY X-MP and CRAY-2

Allen D. Malony, John L. Larson, Daniel A. Reed
1991 Journal of Supercomputing  
Unfortunately, many high-performance machines provide execution profile summaries as the only tool for performance investigation.  ...  Our conclusion is that adding tracing support in Cray supercomputers can have significant returns in improved performance characterization and evaluation.  ...  Standard Cray Tools Two profiling tools are commonly used on Cray systems: Flowtrace and Perftrace. The Flowtrace tool [2] is available on the Cray X-MP, Cray Y-MP, and Cray 2.  ... 
doi:10.1007/bf00127841 fatcat:5zjgqu6eenfqdggk7ypkxbyfde

A High Level Programming Environment for Accelerator-based Systems

Luiz DeRose, Heidi Poxon, James Beyer, Alistair Hart
2014 Procedia Computer Science  
In this paper we present the Cray high level programming environment for accelerated computing, which tightly integrates compilers, tools, and scientific libraries.  ...  In Section 3 we present the Cray programming environment for accelerated computing, focusing on the programmability features in the compiler, tools, and libraries.  ...  The main design goal of the Cray performance tools for accelerated computing is to provide the user with an integrated performance measurement and analysis toolset that would view the hybrid system (host  ... 
doi:10.1016/j.procs.2014.05.134 fatcat:hiocitrfhrcwdbla4vezlr3bba

Evaluating Parallel Programming Tools to Support Code Development for Accelerators

Rebecca Hartman-Baker, Valerie Maxville, Daniel Grimwood
2014 Procedia Computer Science  
While accelerators provide impressive performance and efficiency, an important factor in this decision is the usability of the technologies.  ...  To assist in the assessment of technologies, iVEC conducted a code sprint where iVEC staff and advanced users were paired to make use of a range of tools to port their codes to two architectures.  ...  We are grateful for access to systems provided by CSCS (Dommic - Xeon Phi, Todi - Cray tools), The University of Tennessee (Beacon - Xeon Phi) and ORNL (Titan - NVIDIA Kepler).  ... 
doi:10.1016/j.procs.2014.05.191 fatcat:lghoyvonl5dszgsid4az3vca3a

A Shared Memory MPP from Cray Research

R. Kent Koeninger, Mark Furtney, Martin Walker
1994 Digital technical journal of Digital Equipment Corporation  
The CRAY T3D system is the first massively parallel processor from Cray Research. The implementation entailed the design of system software, hardware, languages, and tools.  ...  Additional topics include latency-hiding and synchronization hardware, libraries, operating system, and tools.  ...  A key software tool is the MPP Apprentice, a performance analysis tool based in part on ideas developed by  ...  T3D system by emulating CRAY T3D codes on any CRAY Y-MP system. The emulator supports Fortran  ... 
dblp:journals/dtj/KoeningerFW94 fatcat:34gbfv33njbvpa2usuxlcxzose

Towards Parallel Performance Analysis Tools for the OpenSHMEM Standard [chapter]

Sebastian Oeste, Andreas Knüpfer, Thomas Ilsche
2014 Lecture Notes in Computer Science  
Original OpenSHMEM 2013/2014 workshop paper: Sebastian Oeste, Andreas Knüpfer, Thomas Ilsche: Towards Parallel Performance Analysis Tools for the OpenSHMEM Standard. Demonstrator for Cray SHMEM  ...  ., atomic operations. Many other event types reused: Enter and Leave for API calls and user routine calls; Performance counter samples. Demonstrator based on Cray SHMEM and VampirTrace. From the  ... 
doi:10.1007/978-3-319-05215-1_7 fatcat:bdx2m57c6rf3fhvfn4bf5zdrfa

Performance Assessment of OpenMP Compilers Targeting NVIDIA V100 GPUs [article]

Joshua Hoke Davis, Christopher Daley, Swaroop Pophale, Thomas Huber, Sunita Chandrasekaran, Nicholas J. Wright
2020 arXiv   pre-print
We observe that the performance varies widely from one compiler to the other; a crucial aspect of our work is reporting best practices to application developers who use OpenMP offloading compilers.  ...  18x speedup for the su3 proxy application on NERSC's Cori system when using the Clang compiler, and a 15.7x speedup by switching max reductions to add reductions in the laplace mini-app when using the Cray-llvm  ...  Knowing which compilers perform poorly for a given application, we use profiling tools to uncover the underlying reasons for such poor performance.  ... 
arXiv:2010.09454v3 fatcat:gjznbreafrbsreicsgj5xgnx4y

LARGE-SCALE PERFORMANCE ANALYSIS OF SWEEP3D WITH THE SCALASCA TOOLSET

BRIAN J. N. WYLIE, MARKUS GEIMER, BERND MOHR, DAVID BÖHME, ZOLTÁN SZEBENYI, FELIX WOLF
2010 Parallel Processing Letters  
As part of its ongoing development, execution performance with up to 128k processor cores on Cray XT and IBM BG/P systems has been investigated, and a variety of aspects have been identified to inhibit  ...  PFLOTRAN performance at larger scales using the open-source Scalasca toolset.  ...  It was supported by allocations of advanced computing resources on the Jugene IBM Blue Gene/P of Jülich Supercomputing Centre at Forschungszentrum Jülich and the Jaguar Cray XT5 of the National Center  ... 
doi:10.1142/s0129626410000314 fatcat:wqvm53qy2bhe5apg4n6d5fz75i

Fly On Cray: Porting, Optimization And Performance Analysis Of Cosmological Simulation Code Fly On Cray Xe6 Architecture

M. Cytowski
2013 Zenodo  
Hermit (Cray XE6).  ...  In this whitepaper we report work that was done to investigate and improve the performance of a mixed MPI and OpenMP implementation of the FLY code for cosmological simulations on a PRACE Tier-0 system  ...  Therefore the Cray compiler was chosen for the remainder of the project. Performance analysis In this section we describe the performance analysis steps that we have implemented.  ... 
doi:10.5281/zenodo.831478 fatcat:kbbvaet6s5b3xnuxg4f34r7n3m

ACORN: APL to C on real numbers

Robert Bernecky, Charles Brenner, Stephen B. Jaffe, George P. Moeckel
1990 Conference proceedings on APL 90: for the future - APL '90  
ACORN currently produces code which runs slower than hand-coded Cray FORTRAN, but we have identified the major performance bottlenecks, and believe we know how to remove them.  ...  A prototype APL to C compiler (ACORN: APL to C On Real Numbers) was produced while investigating improved tools for solving numerically intensive problems on supercomputers.  ...  Don Isgitt and Dale Mihalyi assisted us in the generation of seismic test data and educated us in the use of the Cray. Elena Anzalone edited the report, improving its readability and organization.  ... 
doi:10.1145/97808.97821 dblp:conf/apl/BerneckyBJM90 fatcat:qdpx6sikbbh3xgmhod4olhjqzi

Supercomputing for the parallelization of whole genome analysis

M. J. Puckelwartz, L. L. Pesce, V. Nelakuditi, L. Dellefave-Castillo, J. R. Golbus, S. M. Day, T. P. Cappola, G. W. Dorn, I. T. Foster, E. M. McNally
2014 Bioinformatics  
Results: We now adapted a Cray XE6 supercomputer to achieve the parallelization required for concurrent multiple genome analysis.  ...  Availability and implementation: The MegaSeq workflow is designed to harness the size and memory of the Cray XE6, housed at Argonne National Laboratory, for whole genome analysis in a platform designed  ...  On smaller datasets, the CRAY XE6 has the capability to perform recalibration across the genome in a reasonable time frame.  ... 
doi:10.1093/bioinformatics/btu071 pmid:24526712 pmcid:PMC4029034 fatcat:zfbyftzrr5cjdnzqrtzxtprlti

ACORN: APL to C on real numbers

Robert Bernecky, Charles Brenner, Stephen B. Jaffe, George P. Moeckel
1990 ACM SIGAPL APL Quote Quad  
ACORN currently produces code which runs slower than hand-coded Cray FORTRAN, but we have identified the major performance bottlenecks, and believe we know how to remove them.  ...  A prototype APL to C compiler (ACORN: APL to C On Real Numbers) was produced while investigating improved tools for solving numerically intensive problems on supercomputers.  ...  Don Isgitt and Dale Mihalyi assisted us in the generation of seismic test data and educated us in the use of the Cray. Elena Anzalone edited the report, improving its readability and organization.  ... 
doi:10.1145/97811.97821 fatcat:r7dxgxpuxbhwxaqxjtyti77hwe

Diagnosing performance bottlenecks in emerging petascale applications

Nathan R. Tallent, John M. Mellor-Crummey, Laksono Adhianto, Michael W. Fagan, Mark Krentel
2009 Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis - SC '09  
In this paper, we describe HPCToolkit, a suite of multi-platform tools that supports sampling-based analysis of application performance on emerging petascale platforms.  ...  Consequently, there is a critical need for performance tools that enable scientists to understand impediments to performance on emerging petascale systems.  ...  RELATED WORK Most studies of application scaling on petascale systems have relied on manual analysis rather than sophisticated performance tools to understand scalability [1, 2, 3, 13].  ... 
doi:10.1145/1654059.1654111 dblp:conf/sc/TallentMAFK09 fatcat:ydfk4ojum5f3rkr7fc757s7vdu

Page 10 of American Society of Civil Engineers. Collected Journals Vol. 116, Issue CP1 [page]

1990 American Society of Civil Engineers. Collected Journals  
The computational performance of NETSIM on the CRAY is then related to the characteristics of the network, yielding a tool for the prediction of execution-time requirements for other applications.  ...  In the following analysis, we assess the reduction in execution times on the CRAY relative to mainframe processing both with and without the compiler-vectorization capabilities.  ... 

Performance analysis and Optimisation of the Met Unified Model on a Cray XC30 [article]

Karthee Sivalingam, Grenville Lister, Bryan Lawrence
2015 arXiv   pre-print
On ARCHER, we use Cray Performance Analysis Tools (CrayPAT) to analyse the performance of UM and then Cray Reveal to identify and parallelise serial loops using OpenMP directives.  ...  Nonetheless, it is clear that the investment of months in analysis and optimisation has yielded performance gains that correspond to the saving of tens of millions of core-hours on current climate projects  ...  Cray Reveal is an integrated performance analysis and code optimisation tool. It provides loop analysis and scoping of serial loops and suggests OpenMP directives that can be inserted to a loop.  ... 
arXiv:1511.03885v1 fatcat:jt55dilzkfavve76ivich57uzm

The Eclipse parallel tools platform

Jay Alameda, Wyatt Spear, Jeffrey L. Overbey, Kevin Huck, Gregory R. Watson, Beth Tibbitts
2012 Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment on Bridging from the eXtreme to the campus and beyond - XSEDE '12  
The Parallel Tools Platform (PTP) [2] extends Eclipse to support development on high performance computers.  ...  include submission and monitoring of jobs on systems running Sun/Oracle Grid Engine, support for GSI authentication and MyProxy logon, support for environment modules, and integration with compilers from Cray  ...  effort of the University of Illinois at Urbana-Champaign, its National Center for Supercomputing Applications, Cray, and the Great Lakes Consortium for Petascale Computation.  ... 
doi:10.1145/2335755.2335845 fatcat:kwwcwrpfb5bu7iov6cvn5uir5a