An empirical study of the CRAY Y-MP processor using the Perfect club benchmarks

Sriram Vajapeyam, Gurindar S. Sohi, Wei-Chung Hsu
1991 Proceedings of the 18th annual international symposium on Computer architecture - ISCA '91  
Second, we present some data regarding the time taken for program execution. Such data are collected using the Hardware Performance Monitor (HPM) available on the CRAY Y-MP.  ...  Therefore, many of the MOV instructions move data from stem in the CRAY X-MP and the CRAY Y-MP (the CRAY-1 had a single memory port, while the X-MP and the Y-MP have three data memory  ... 
doi:10.1145/115952.115970 dblp:conf/isca/VajapeyamSH91 fatcat:v4vc6xqbz5bjllgz7r5cuckvzy

Impact of Quad-Core Cray XT4 System and Software Stack on Scientific Computation [chapter]

S. R. Alam, R. F. Barrett, H. Jagode, J. A. Kuehn, S. W. Poole, R. Sankaran
2009 Lecture Notes in Computer Science  
In this paper, we evaluate impact of a subset of these key changes on large-scale scientific applications.  ...  For instance, we demonstrate that the vectorization instructions (SSE) provide a performance boost of as much as 50% on fusion and combustion applications.  ...  Acknowledgements This work was supported by the United States Department of Defense and used resources of the Extreme Scale Systems Center and the National Center for Computational Sciences at Oak Ridge  ... 
doi:10.1007/978-3-642-03869-3_33 fatcat:pe6kzyururgjbjzrini3fhatly

Scalable, parallel computers: Alternatives, issues, and challenges

Gordon Bell
1994 International journal of parallel programming  
In 1993, Cray Research offers a range of products from $300,000 to over $30 million spanning a performance range of 100, using both CMOS and ECL implementations of the Cray supercomputer architecture.  ...  system, compilers, performance monitoring, etc.).  ...  Two basic programming paradigms are used: data parallel using a dialect of FORTRAN, such as FORTRAN90, High Performance FORTRAN (HPF), or just FORTRAN 77-multiple copies of a Single Program that operate  ... 
doi:10.1007/bf02577791 fatcat:jnvgpsftabcnnabkmpcm5kifqq

A Parallel Programming Environment

J.R. Allen, K. Kennedy
1985 IEEE Software  
The ing performance to an unacceptable problem with this approach is that level? concurrent programming is unnatural These problems can be solved by a for many scientific programmers.  ...  While the resulting strategy in the Cray X-MP.I In the but will uncover natural programs are usually very efficient on second design, many small processors, a scalar machine, they are often in-say 50 to  ...  to understand ployed for a machine like the Cray global memory, all these words will be the intricacies of programming with X-MP, which has only two processors. updated in the background and the synchronization  ... 
doi:10.1109/ms.1985.231370 fatcat:uffpl34eefbevo5hx3o44kybay

Benchmark Tests on the New IBM RISC System/6000 590 Workstation

Harvey J. Wasserman
1995 Scientific Programming  
A set of well-characterized Fortran benchmarks spanning a range of computational characteristics was used for the study.  ...  The results of benchmark tests on the superscalar IBM RISC System/6000 Model 590 are presented.  ...  To do this we assume that both machines are carrying out the same number of FLOPS for each code and we use the C90 hardware performance monitor to count the FLOPS.  ... 
doi:10.1155/1995/269236 fatcat:eem7obv52jacxccf4zxxdqc3ee

Performance characteristics of an adaptive mesh refinement calculation on scalar and vector platforms

Michael Welcome, Charles Rendleman, Leonid Oliker, Rupak Biswas
2006 Proceedings of the 3rd conference on Computing frontiers - CF '06  
To the best of our knowledge, this is the first work that investigates and characterizes the performance of an AMR calculation on modern parallel-vector systems.  ...  In this paper, we examine the HyperCLaw AMR framework to compare and contrast performance on the Cray X1E, IBM Power3 and Power5, and SGI Altix.  ...  Acknowledgments The authors sincerely thank ORNL for providing access to the X1E. All authors from LBNL were supported by OASCR in the DOE Office of Science under contract DE-AC03-76SF-00098.  ... 
doi:10.1145/1128022.1128074 dblp:conf/cf/WelcomeROB06 fatcat:dnqc6i2scvczhccgmudyhv7qiu

Architectural specification for massively parallel computers: an experience and measurement-based approach

Ron Brightwell, William Camp, Benjamin Cole, Erik DeBenedictis, Robert Leland, James Tomkins, Arthur B. Maccabe
2005 Concurrency and Computation  
We present a comparison of benchmarks and application performance that support our approach. We also project the performance of Red Storm and the Earth simulator.  ...  We discuss the evolution of this architecture and provide reasons for the different choices that have been made.  ...  This line of development was extended in the US to include vector multiprocessors, first seen in the Cray X-MP and extended into the Cray Y-MP, the Cray C-90, the ETA-10, the Cray T-90, and the IBM 390  ... 
doi:10.1002/cpe.893 fatcat:pzr2kymiajeshassp7oiinr2oi

FlexQuery: An online query system for interactive remote visual data exploration at large scale

Hongbo Zou, Karsten Schwan, Magdalena Slawinska, Matt Wolf, Greg Eisenhauer, Fang Zheng, Jai Dayal, Jeremy Logan, Qing Liu, Scott Klasky, Tanja Bode, Michael Clark (+1 others)
2013 2013 IEEE International Conference on Cluster Computing (CLUSTER)  
The remote visual exploration of live data generated by scientific simulations is useful for scientific discovery, performance monitoring, and online validation for the simulation results.  ...  FlexQuery carefully extends such analytics pipelines, using online performance monitoring and data location tracking, to realize data queries in ways that minimize additional data movement and offer low  ...  of running simulations [18], for scientific validity [16] or to enhance simulation performance [29].  ... 
doi:10.1109/cluster.2013.6702635 dblp:conf/cluster/ZouSSWEZDLLKBCK13 fatcat:srvykezodbgodp65kdohn3ejkm

Performance measurement, visualization and modeling of parallel and distributed programs using the AIMS toolkit

Jerry Yan, Sekhar Sarukkai, Pankaj Mehra
1995 Software, Practice & Experience  
Writing large-scale parallel and distributed scientific applications that make optimum use of the multiprocessor is a challenging problem.  ...  Using several examples representing a broad range of scientific applications, we illustrate AIMS' effectiveness in exposing performance problems in parallel and distributed programs.  ...  ATexpert [19], developed for the Cray Y/MP, predicts speed-ups obtainable in various regions of a program as the number of processors is increased.  ... 
doi:10.1002/spe.4380250406 fatcat:obvco3kbenghbpnjdw3wnudoaq

Introduction to parallel computing

1992 ChoiceReviews  
A basic understanding of the parallel computing techniques that assist in the capture and utilization of that computational power is essential to appreciate the capabilities and the limitations of parallel  ...  The relevant techniques, vocabulary, currently available hardware architectures, and programming languages which provide the basic concepts of parallel computing are introduced in this document.  ...  Figure 2-14 shows the memory access scheme of a two-CPU Cray X-MP.  ... 
doi:10.5860/choice.30-1558 fatcat:7qgaeun2ujgcjj3cc3bfbvvbi4

Introduction to parallel computing

2004 ChoiceReviews  
A basic understanding of the parallel computing techniques that assist in the capture and utilization of that computational power is essential to appreciate the capabilities and the limitations of parallel  ...  The relevant techniques, vocabulary, currently available hardware architectures, and programming languages which provide the basic concepts of parallel computing are introduced in this document.  ...  Figure 2-14 shows the memory access scheme of a two-CPU Cray X-MP.  ... 
doi:10.5860/choice.42-0990 fatcat:kiymwofenbavfdixftjhnuavx4

SAGE: Percipient Storage for Exascale Data Centric Computing

Sai Narasimhamurthy, Nikita Danilov, Sining Wu, Ganesan Umanesan, Stefano Markidis, Sergio Rivas-Gomez, Ivy Bo Peng, Erwin Laure, Dirk Pleiter, Shaun de Witt
2018 Parallel Computing  
The SAGE system will be capable of storing and processing immense volumes of data at the Exascale regime, and provide the capability for Exascale class applications to use such a storage infrastructure  ...  The objective of this paper is to discuss the software architecture of the SAGE system and look at early results we have obtained employing some of its key methodologies, as the system continues to evolve  ...  Acknowledgements The authors acknowledge that the SAGE work is being performed by a consortium of members consisting of Seagate(UK), Bull ATOS(France), ARM(UK), KTH(Sweden), STFC(UK), CCFE(UK), Diamond  ... 
doi:10.1016/j.parco.2018.03.002 fatcat:ksduiadutzdtdgnjarcjyjfjpa

A Survey of High-Performance Interconnection Networks in High-Performance Computer Systems

Ping-Jing Lu, Ming-Che Lai, Jun-Sheng Chang
2022 Electronics  
This article analyzes the main interconnection networks used by high-performance computer systems in the Top500 list of November 2021, and it elaborates the design of representative, state-of-the-art,  ...  Its performance and scalability directly affect the performance and scalability of the whole system.  ...  scientific and commercial workloads.  ... 
doi:10.3390/electronics11091369 fatcat:lv7nczjbsbe3jikypuw6bfbc4u

A Performance Evaluation of the Convex SPP-1000 Scalable Shared Memory Parallel Computer

Thomas Sterling, Daniel Savaresse, Peter MacNeice, Kevin Olson, Clark Mobarry, Bruce Fryxell, Phillip Merkey
1995 Proceedings of the 1995 ACM/IEEE conference on Supercomputing (CDROM) - Supercomputing '95  
This paper presents the findings of a set of empirical studies using both synthetic test codes and full applications for the Earth and space sciences to characterize the performance properties of this  ...  The Convex SPP-1000 is the first commercial implementation of a new generation of scalable shared memory parallel computers with full cache coherence.  ...  Acknowledgments This research has been supported by the NASA High Performance Computing and Communication Initiative.  ... 
doi:10.1145/224170.285573 dblp:conf/sc/SterlingSMOMFM95 fatcat:4mt53u4lqzdzjdlzh6xcbiqbc4

D9.2.2: Final Software Evaluation Report

Jose Carlos, Guillaume Colin de Verdière, Matthieu Hautreux, Giannis Koutsou
2012 Zenodo  
The characteristics of these prototypes were selected in order to allow investigation into a number of key aspects relevant to high performance computing, namely interconnects, I/O, energy efficiency and  ...  This deliverable reports on the latest software developments in high performance computing, as identified by the PRACE-1IP, WP9 members.  ...  the shared memory as: x = puold[i]; pould[j] = y; The first version degraded performance by one to two orders of magnitude, due to the fact that the global shared pointer to the shared array belonged to  ... 
doi:10.5281/zenodo.6553027 fatcat:6vbrtqizm5eutmmskf44eltoqq