Performance evaluation of supercomputers using HPCC and IMB benchmarks
2006
Proceedings 20th IEEE International Parallel & Distributed Processing Symposium
supercomputers - SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon Cluster, and NEC SX-8. ...
Performance of vector systems is an order of magnitude better than scalar systems. Between vector systems, performance of NEC SX-8 is better than that of Cray X1. ...
doi:10.1109/ipdps.2006.1639622
dblp:conf/ipps/SainiCGSKDARTMF06
fatcat:olviwd6bbve6bhkasohd36rfq4
SPEC OpenMP Benchmarks on Four Generations of NEC SX Parallel Vector Systems
[chapter]
2008
Lecture Notes in Computer Science
Points of interest are vectorization, scalability and the comparison between different generations of the same family of NEC SX vector supercomputers. ...
Of special interest is the fact that the NEC SX parallel architecture is not cache consistent. ...
Basic performance characteristics of the NEC SX family. ...
doi:10.1007/978-3-540-68555-5_12
fatcat:vhpptbq6njgu5afuynwptprjty
Performance evaluation of supercomputers using HPCC and IMB Benchmarks
2008
Journal of computer and system sciences (Print)
supercomputers - SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon Cluster, and NEC SX-8. ...
The HPC Challenge (HPCC) Benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem and interconnect fabric of five leading ...
Acknowledgments Work by LLNL was performed under the auspices of the U.S. Department of Energy by University of California, Lawrence Livermore National Laboratory under Contract W-7405-Eng-48. ...
doi:10.1016/j.jcss.2007.07.002
fatcat:vo4bbxjowzhtlndm5s4h6iionm
Developing an Architecture-independent Graph Framework for Modern Vector Processors and NVIDIA GPUs
2020
Supercomputing Frontiers and Innovations
Currently VGL supports two classes of architectures: NEC SX-Aurora TSUBASA vector processors and NVIDIA GPUs. ...
Additionally, in this paper we show how graph algorithms should be implemented and optimised for NVIDIA GPU and NEC SX-Aurora TSUBASA architectures, demonstrating that both architectures have multiple ...
This paper is distributed under the terms of the Creative Commons Attribution-Non Commercial 3.0 License which permits non-commercial use, reproduction and distribution of the work without further permission ...
doi:10.14529/jsfi200404
fatcat:jhlidinsbbhijmtfsjennra3nq
Porting the 3D gyrokinetic particle-in-cell code GTC to the NEC SX-6 vector architecture: perspectives and challenges
2004
Computer Physics Communications
Several years of optimization on the cache-based super-scalar architecture has made it more difficult to port the current version of the 3D particle-in-cell code GTC to the NEC SX-6 vector architecture ...
After a few modifications, single-processor results show a performance increase of 5.2 compared to the IBM SP Power3 processor, and 2.7 compared to the Power4. ...
DE-AC020-76-CH03073 and in part by the DOE SciDAC Plasma Microturbulence Project. ...
doi:10.1016/j.cpc.2004.06.060
fatcat:sm2c7n5fmbf4jejjqae3lwpt2q
Block-Based Approach to Solving Linear Systems
[chapter]
2007
Lecture Notes in Computer Science
First, the performance of widely used public domain solvers which target performance on scalar machines is analyzed on a typical vector machine. ...
Then, a newly developed parallel sparse iterative solver (Block-based Linear Iterative Solver, BLIS) targeting performance on both scalar and vector systems is introduced and the time needed for solving ...
Fig. 1. Single CPU performance of Sparse MVP on NEC SX-8
The fluid is assumed to be incompressible Newtonian with a kinematic viscosity $\nu = 10^{-3}\,\mathrm{m^2/s}$ and a density of $\rho = 1.0\,\mathrm{kg/m^3}$. ...
doi:10.1007/978-3-540-72584-8_17
fatcat:4m4exqyhu5eu5gazl6qvcuylri
SX-ACE processor: NEC's brand-new vector processor
2014
2014 IEEE Hot Chips 26 Symposium (HCS)
Any single architecture cannot cover all application areas. The SX-ACE core can provide the world top-level performance and the largest memory bandwidth. Evaluation of Indirect memory access performance ...
[Slide text] 560 LSI, 30 kW. Many LSI consuming more than 70% of the power. Memory. 6 LSI, 2.8 kW. SX-ACE, 6 nodes, 1.5 TF. Power efficient: number of LSI reduced to 1/100. High performance maintained. ...
doi:10.1109/hotchips.2014.7478805
dblp:conf/hotchips/Momose14
fatcat:ehmy7u2qvrgwfc6bhsmwbdzb5y
The Teraflop Workbench — Enhancing High Sustained Performance on Vector Systems
2009
Interdisciplinary Information Sciences
Performance Computing resources, and they all show high sustained performance on the HPC vector system NEC-SX8. ...
The paper also describes future needs to address when describing the next generation of applications. ...
Fig. 1. Installation
Fig. 3. Maximum sustained performance achievable on HLRS' NEC SX-
Fig. 4. Single CPU performance of Sparse MVP on NEC SX-8. ...
doi:10.4036/iis.2009.45
fatcat:kd5pcjvn5fb4jgyfa36ov64mvm
Distributed Graph Algorithms for Multiple Vector Engines of NEC SX-Aurora TSUBASA Systems
2021
Supercomputing Frontiers and Innovations
Acknowledgements The research is carried out using the equipment of the shared research facilities of HPC computing resources at Lomonosov Moscow State University and the computational resources of Cyberscience ...
The work presented in Section 5 is supported by Russian Ministry of Science and Higher Education, agreement No. 075-15-2019-1621. ...
NEC SX-Aurora TSUBASA Architecture. NEC has developed vector computing systems called the SX series, from the SX-2 released in 1983 to the SX-9 [33] and SX-ACE [13]. ...
doi:10.14529/jsfi210206
fatcat:cxp7bhbo6vfrpf43soxyokfw4e
Characteristics of an On-Chip Cache on NEC SX Vector Architecture
2009
Interdisciplinary Information Sciences
We evaluate the performance of the vector cache on the NEC SX vector processor architecture with bytes per flop rates of 2 B/FLOP and 1 B/FLOP, to clarify the basic characteristics of the vector cache. ...
For the evaluation, we use the NEC SX-7 simulator extended with the vector cache mechanism. ...
Sawaya of Tohoku University, for their advice about applications. We also would like to thank Matsuo Aoyama and Hirofumi Akiba of NEC Software Tohoku for their assistance in experiments. ...
doi:10.4036/iis.2009.51
fatcat:jqdjhyjjazdytdsjpv7lsf5n6m
Evaluating the Performance of OpenMP Offloading on the NEC SX-Aurora TSUBASA Vector Engine
2021
Supercomputing Frontiers and Innovations
The NEC SX-Aurora TSUBASA vector engine (VE) follows the tradition of long vector processors for high-performance computing (HPC). ...
We assess the functionality and present the first performance numbers of real-world HPC kernels. ...
This paper is distributed under the terms of the Creative Commons Attribution-Non Commercial 3.0 License which permits non-commercial use, reproduction and distribution of the work without further permission ...
doi:10.14529/jsfi210204
fatcat:3mqbc6eybbd33hoh6a2bnseime
First Experience of Accelerating a Field-Induced Chiral Transition Simulation Using the SX-Aurora TSUBASA
2021
Supercomputing Frontiers and Innovations
The newly emerged SX-Aurora TSUBASA, the successor of the SX-ACE processor, is expected to provide much higher performance to the programs executed on the SX-ACE as is. ...
For acceleration of the FICT, improvement of the vectorization ratio in the program execution and the efficient transfer of data to the general purpose processor as the vector host from the vector processor ...
Acknowledgements We thank the Cybermedia Center (CMC) of Osaka University which provided the SX-ACE system and the SX-Aurora TSUBASA system. ...
doi:10.14529/jsfi210203
fatcat:frnp4t6j4zcv3f47pum4sfaklu
Network Bandwidth Measurements and Ratio Analysis with the HPC Challenge Benchmark Suite (HPCC)
[chapter]
2005
Lecture Notes in Computer Science
Balance Analysis with HPC Challenge Benchmark Data • How HPCC data can be used to analyze the balance of HPC systems • Details on ring based benchmarks • Resource based ratios • Inter-node bandwidth and ...
• memory bandwidth • versus computational speed • HPCC footprint • Comparing the platforms ...
between memory and CPU. High memory bandwidth ratio on vector-type systems (NEC SX-6, SX-8, Cray X1), but also on Cray XT3. ...
doi:10.1007/11557265_48
fatcat:lhh6zb2nuremlaw5qkxbdalb24
Optimisation of a spline based Eulerian-Lagrangian transport solver
2005
ANZIAM Journal
We discuss the optimisation and parallelisation of a serial, spline based Eulerian-Lagrangian code (elm2d, Fortran 90) on a 64-bit NEC SX-5 platform to support high-resolution numerical experiments. ...
Profiling analysis indicated potential inefficiencies in the spline and diffusion subsystems of the code. Vectorisation of these subsystems achieved more than an order of magnitude speed increase. ...
Acknowledgment: This work was performed with the assistance of the High Performance Computing and Communications Centre, Melbourne, Australia. ...
doi:10.21914/anziamj.v46i0.1005
fatcat:f5md53kplrdhfnpy6zd6qp37bq
A Memory-Efficient Implementation of a Plasmonics Simulation Application on SX-ACE
2016
International Journal of Networking and Computing
To validate the effectiveness of our approaches, a plasmonics simulation application is evaluated on vector platforms NEC SX-ACE, NEC SX-9, and Intel Xeon based platform NEC LX 406-Re2. ...
In this paper, we discuss a set of approaches to optimization of the memory access behavior of the applications, which enable their executions with improved performance on the recent HPC systems. ...
Acknowledgments The authors would like to thank Mr. Sourav Saha of NEC Technologies India for his continuous efforts for the memory optimization and application evaluation. ...
doi:10.15803/ijnc.6.2_243
fatcat:ugvscscszbefjomyyix52g63tq