Performance evaluation of supercomputers using HPCC and IMB benchmarks

S. Saini, R. Ciotti, B.T.N. Gunney, T.E. Spelce, A. Koniges, D. Dossa, P. Adamidis, R. Rabenseifner, S.R. Tiyyagura, M. Mueller, R. Fatoohi
2006 Proceedings 20th IEEE International Parallel & Distributed Processing Symposium  
supercomputers: SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon cluster, and NEC SX-8.  ...  Performance of vector systems is an order of magnitude better than scalar systems. Between vector systems, performance of NEC SX-8 is better than that of Cray X1.  ... 
doi:10.1109/ipdps.2006.1639622 dblp:conf/ipps/SainiCGSKDARTMF06 fatcat:olviwd6bbve6bhkasohd36rfq4

SPEC OpenMP Benchmarks on Four Generations of NEC SX Parallel Vector Systems [chapter]

Matthias S. Müller
2008 Lecture Notes in Computer Science  
Points of interest are vectorization, scalability and the comparison between different generations of the same family of NEC SX vector supercomputers.  ...  Of special interest is the fact that the NEC SX parallel architecture is not cache consistent.  ...  Basic performance characteristics of the NEC SX family.  ... 
doi:10.1007/978-3-540-68555-5_12 fatcat:vhpptbq6njgu5afuynwptprjty

Performance evaluation of supercomputers using HPCC and IMB Benchmarks

Subhash Saini, Robert Ciotti, Brian T.N. Gunney, Thomas E. Spelce, Alice Koniges, Don Dossa, Panagiotis Adamidis, Rolf Rabenseifner, Sunil R. Tiyyagura, Matthias Mueller
2008 Journal of computer and system sciences (Print)  
supercomputers: SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon Cluster, and NEC SX-8.  ...  The HPC Challenge (HPCC) Benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem and interconnect fabric of five leading  ...  Acknowledgments Work by LLNL was performed under the auspices of the U.S. Department of Energy by University of California, Lawrence Livermore National Laboratory under Contract W-7405-Eng-48.  ... 
doi:10.1016/j.jcss.2007.07.002 fatcat:vo4bbxjowzhtlndm5s4h6iionm

Developing an Architecture-independent Graph Framework for Modern Vector Processors and NVIDIA GPUs

2020 Supercomputing Frontiers and Innovations  
Currently VGL supports two classes of architectures: NEC SX-Aurora TSUBASA vector processors and NVIDIA GPUs.  ...  Additionally, in this paper we show how graph algorithms should be implemented and optimised for NVIDIA GPU and NEC SX-Aurora TSUBASA architectures, demonstrating that both architectures have multiple  ...  This paper is distributed under the terms of the Creative Commons Attribution-Non Commercial 3.0 License which permits non-commercial use, reproduction and distribution of the work without further permission  ... 
doi:10.14529/jsfi200404 fatcat:jhlidinsbbhijmtfsjennra3nq

Porting the 3D gyrokinetic particle-in-cell code GTC to the NEC SX-6 vector architecture: perspectives and challenges

S. Ethier, Z. Lin
2004 Computer Physics Communications  
Several years of optimization on the cache-based super-scalar architecture has made it more difficult to port the current version of the 3D particle-in-cell code GTC to the NEC SX-6 vector architecture  ...  After a few modifications, single-processor results show a performance increase of 5.2 compared to the IBM SP Power3 processor, and 2.7 compared to the Power4.  ...  DE-AC020-76-CH03073 and in part by the DOE SciDAC Plasma Microturbulence Project.  ... 
doi:10.1016/j.cpc.2004.06.060 fatcat:sm2c7n5fmbf4jejjqae3lwpt2q

Block-Based Approach to Solving Linear Systems [chapter]

Sunil R. Tiyyagura, Uwe Küster
2007 Lecture Notes in Computer Science  
First, the performance of widely used public domain solvers which target performance on scalar machines is analyzed on a typical vector machine.  ...  Then, a newly developed parallel sparse iterative solver (Block-based Linear Iterative Solver, BLIS) targeting performance on both scalar and vector systems is introduced and the time needed for solving  ...  Fig. 1. Single CPU performance of Sparse MVP on NEC SX-8.  ...  The fluid is assumed to be incompressible Newtonian with a kinematic viscosity ν = 10⁻³ m²/s and a density of ρ = 1.0 kg/m³.  ... 
doi:10.1007/978-3-540-72584-8_17 fatcat:4m4exqyhu5eu5gazl6qvcuylri

SX-ACE processor: NEC's brand-new vector processor

Shintaro Momose
2014 2014 IEEE Hot Chips 26 Symposium (HCS)  
Any single architecture cannot cover all application areas. The SX-ACE core can provide the world top-level performance and the largest memory bandwidth. Evaluation of indirect memory access performance  ...  560 LSI, 30 kW. Many LSI consuming more than 70% of the power memory. 6 LSI, 2.8 kW. SX-ACE 6 nodes, 1.5 TF. Power efficient. Number of LSI reduced to 1/100. High performance maintained.  ... 
doi:10.1109/hotchips.2014.7478805 dblp:conf/hotchips/Momose14 fatcat:ehmy7u2qvrgwfc6bhsmwbdzb5y

The Teraflop Workbench — Enhancing High Sustained Performance on Vector Systems

Sabine ROLLER, Michael RESCH, Martin GALLE, Wolfgang BEZ
2009 Interdisciplinary Information Sciences  
Performance Computing resources, and they all show high sustained performance on the HPC vector system NEC-SX8.  ...  The paper also describes future needs to address when describing the next generation of applications.  ...  Fig. 1. Installation.  ...  Fig. 3. Maximum sustained performance achievable on HLRS' NEC SX-  ...  Fig. 4. Single CPU performance of Sparse MVP on NEC SX-8.  ... 
doi:10.4036/iis.2009.45 fatcat:kd5pcjvn5fb4jgyfa36ov64mvm

Distributed Graph Algorithms for Multiple Vector Engines of NEC SX-Aurora TSUBASA Systems

2021 Supercomputing Frontiers and Innovations  
Acknowledgements The research is carried out using the equipment of the shared research facilities of HPC computing resources at Lomonosov Moscow State University and the computational resources of Cyberscience  ...  The work presented in Section 5 is supported by Russian Ministry of Science and Higher Education, agreement No. 075-15-2019-1621.  ...  NEC SX-Aurora TSUBASA Architecture NEC has developed vector computing systems called SX series since SX-2 released in 1983 to SX-9 [33], and SX-ACE [13].  ... 
doi:10.14529/jsfi210206 fatcat:cxp7bhbo6vfrpf43soxyokfw4e

Characteristics of an On-Chip Cache on NEC SX Vector Architecture

Akihiro MUSA, Yoshiei SATO, Ryusuke EGAWA, Hiroyuki TAKIZAWA, Koki OKABE, Hiroaki KOBAYASHI
2009 Interdisciplinary Information Sciences  
We evaluate the performance of the vector cache on the NEC SX vector processor architecture with bytes per flop rates of 2 B/FLOP and 1 B/FLOP, to clarify the basic characteristics of the vector cache.  ...  For the evaluation, we use the NEC SX-7 simulator extended with the vector cache mechanism.  ...  Sawaya of Tohoku University, for their advice about applications. We also would like to thank Matsuo Aoyama and Hirofumi Akiba of NEC Software Tohoku for their assistance in experiments.  ... 
doi:10.4036/iis.2009.51 fatcat:jqdjhyjjazdytdsjpv7lsf5n6m

Evaluating the Performance of OpenMP Offloading on the NEC SX-Aurora TSUBASA Vector Engine

2021 Supercomputing Frontiers and Innovations  
The NEC SX-Aurora TSUBASA vector engine (VE) follows the tradition of long vector processors for high-performance computing (HPC).  ...  We assess the functionality and present the first performance numbers of real-world HPC kernels.  ...  This paper is distributed under the terms of the Creative Commons Attribution-Non Commercial 3.0 License which permits non-commercial use, reproduction and distribution of the work without further permission  ... 
doi:10.14529/jsfi210204 fatcat:3mqbc6eybbd33hoh6a2bnseime

First Experience of Accelerating a Field-Induced Chiral Transition Simulation Using the SX-Aurora TSUBASA

2021 Supercomputing Frontiers and Innovations  
The newly emerged SX-Aurora TSUBASA, the successor of the SX-ACE processor, is expected to provide much higher performance to the programs executed on the SX-ACE as is.  ...  For acceleration of the FICT, improvement of the vectorization ratio in the program execution and the efficient transfer of data to the general purpose processor as the vector host from the vector processor  ...  Acknowledgements We thank the Cybermedia Center (CMC) of Osaka University which provided the SX-ACE system and the SX-Aurora TSUBASA system.  ... 
doi:10.14529/jsfi210203 fatcat:frnp4t6j4zcv3f47pum4sfaklu

Network Bandwidth Measurements and Ratio Analysis with the HPC Challenge Benchmark Suite (HPCC) [chapter]

Rolf Rabenseifner, Sunil R. Tiyyagura, Matthias Müller
2005 Lecture Notes in Computer Science  
Balance Analysis with HPC Challenge Benchmark Data: how HPCC data can be used to analyze the balance of HPC systems; details on ring based benchmarks; resource based ratios; inter-node bandwidth and  ...  memory bandwidth versus computational speed; HPCC footprint; comparing the platforms  ...  between memory and CPU. High memory bandwidth ratio on vector-type systems (NEC SX-6, SX-8, Cray X1), but also on Cray XT3.  ... 
doi:10.1007/11557265_48 fatcat:lhh6zb2nuremlaw5qkxbdalb24

Optimisation of a spline based Eulerian-Lagrangian transport solver

S. J. Leak, M. G. Trefry, F. P. Ruan
2005 ANZIAM Journal  
We discuss the optimisation and parallelisation of a serial, spline based Eulerian-Lagrangian code (elm2d, Fortran 90) on a 64-bit NEC SX-5 platform to support high-resolution numerical experiments.  ...  Profiling analysis indicated potential inefficiencies in the spline and diffusion subsystems of the code. Vectorisation of these subsystems achieved more than an order of magnitude speed increase.  ...  Acknowledgment: This work was performed with the assistance of the High Performance Computing and Communications Centre, Melbourne, Australia.  ... 
doi:10.21914/anziamj.v46i0.1005 fatcat:f5md53kplrdhfnpy6zd6qp37bq

A Memory-Efficient Implementation of a Plasmonics Simulation Application on SX-ACE

Raghunandan Mathur, Hiroshi Matsuoka, Osamu Watanabe, Akihiro Musa, Ryusuke Egawa, Hiroaki Kobayashi
2016 International Journal of Networking and Computing  
To validate the effectiveness of our approaches, a plasmonics simulation application is evaluated on vector platforms NEC SX-ACE, NEC SX-9, and Intel Xeon based platform NEC LX 406-Re2.  ...  In this paper, we discuss a set of approaches to optimization of the memory access behavior of the applications, which enable their executions with improved performance on the recent HPC systems.  ...  Acknowledgments The authors would like to thank Mr. Sourav Saha of NEC Technologies India for his continuous efforts for the memory optimization and application evaluation.  ... 
doi:10.15803/ijnc.6.2_243 fatcat:ugvscscszbefjomyyix52g63tq