A design of hybrid operating system for a parallel computer with multi-core and many-core processors

Mikiko Sato, Go Fukazawa, Kiyohiko Nagamine, Ryuichi Sakamoto, Mitaro Namiki, Kazumi Yoshinaga, Yuichi Tsujita, Atsushi Hori, Yutaka Ishikawa
2012 Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers - ROSS '12  
(process, memory, and I/O management) Hybrid computer system overview • Linux on Multi-core CPU works as Host OS  I/O devices and Many-core resources management • Light Weight OS on Many-core  ...  on Many-core CPU is virtualized to a thread  ...  Conclusion • We have proposed a system in which the functions for managing the resources of many-core processors are delegated to the Host OS running on multi-core processor in a parallel computing system  ... 
doi:10.1145/2318916.2318927 fatcat:febvtiheona3tckckgnegvuqve

Task-Based Sparse Hybrid Linear Solver for Distributed Memory Heterogeneous Architectures [chapter]

Emmanuel Agullo, Luc Giraud, Stojce Nakov
2017 Lecture Notes in Computer Science  
However, while each subdomain was processed sequentially (bound to a single CPU core) in the original solver, the new solver instead relies on task-based local solvers, delegating tasks to available  ...  Indeed, a subdomain can now be processed on multiple CPU cores (such as a whole multicore processor or a subset of the available cores) possibly enhanced with GPUs.  ...  of this study is to propose a prototype extension for which each MPI process can handle heterogeneous processing units with a task-based approach, delegating the task management to a runtime system.  ... 
doi:10.1007/978-3-319-58943-5_7 fatcat:snj53iv5lzgnrbudiqkjwombsu

Marrying Many-core Accelerators and InfiniBand for a New Commodity Processor [article]

Konstantin S. Solnushkin, Yuichi Tsujita
2013 arXiv   pre-print
We propose a new heterogeneous processor, equipped with a network controller and designed specifically for HPC.  ...  We then show how it can be used for enterprise computing market, guaranteeing its widespread adoption and therefore low production costs.  ...  Thus we propose a hybrid system, with many-core and multi-core processors within the same compute node. However, MPI is still useful for most applications.  ... 
arXiv:1307.0100v1 fatcat:yxvtz66fcvahvb3wtytdm6nlhe

Evaluation of the SUN UltraSparc T2+ Processor for Computational Science [chapter]

Martin Sandrieser, Sabri Pllana, Siegfried Benkner
2009 Lecture Notes in Computer Science  
A set of benchmarks representing typical building blocks of scientific applications and a real-world hybrid MPI/OpenMP code for ocean simulation are used for performance evaluation.  ...  The Sun UltraSparc T2+ processor was designed for throughput computing and thread level parallelism. In this paper we evaluate its suitability for computational science.  ...  The authors are grateful to Martin Wimmer for numerous discussions and helpful comments regarding the work presented in this paper.  ... 
doi:10.1007/978-3-642-01970-8_97 fatcat:ppz7trki6reptdqig5sqfm35ey

Reverse Offload Programming on Heterogeneous Systems

Cheng Chen, Wenxiang Yang, Fang Wang, Dan Zhao, Yang Liu, Liang Deng, Canqun Yang
2019 IEEE Access  
To achieve high computation throughput, heterogeneous architectures utilize many special-purpose cores to work as floating point computing coprocessors.  ...  To leverage the limited bandwidth of PCIe, we develop a reverse offload (rOffload) model that treats the autonomous Intel Many Integrated Core (MIC) coprocessor as the host processor while the CPU is treated  ...  We also obtain positive experimental results in porting HPL and HPCG with rOffload.  ... 
doi:10.1109/access.2019.2891740 fatcat:fdikvmay7fg6npy7vaxuubrgqm

A Proposed Architecture for Parallel HPC-based Resource Management System for Big Data Applications

Waleed Al Shehri, Maher Khemakhem, Abdullah Basuhail, Fathy E. Eassa
2019 Advances in Science, Technology and Engineering Systems  
This can be achieved by building a parallel HPC-based Resource Management System to exploit the capabilities of HPC resources efficiently.  ...  High-performance computing (HPC) is a technology that is used to perform computations as fast as possible.  ...  It uses a data locality API that allows MPI-based programs to obtain data distribution information for compute nodes.  ... 
doi:10.25046/aj040105 fatcat:knbaqf52bbgili4jgu2akdshvy

Modular FEM framework "ModFem" for generic scientific parallel simulations

Michalik Kazimierz, Banas Krzysztof, Plaszewski Przemyslaw, Cybulka Pawel
2013 Computer Science  
We present the design for, and implementation of, a flexible and robust parallel modular finite element (FEM) framework called ModFEM.  ...  The design is based on reusable modules which use narrow and well-defined interfaces to cooperate. At the top of the architecture, there are problem-dependent modules.  ...  MPI-based parallel communication library Examples and results Examples The framework has been tested for a variety of example problems.  ... 
doi:10.7494/csci.2013.14.3.513 fatcat:632rfec2tjg7fgsnssk4jfeyli

Improving Atmospheric Model Performance on a Multi-Core Cluster System [chapter]

Carla Osthoff, Roberto Pinto, Fabrício Vilasbôas, Pablo Grunmann, Pedro L. Silva Dias, Francieli Boito, Rodrigo Kassick, Laércio Pilla, Philippe Navaux, Claudio Schepke, Nicolas Maillard, Jairo Panetta (+2 others)
2012 Atmospheric Model Applications  
atmospheric models for scientists and modelers.  ...  It incorporates various aspects of environmental computer modeling including an historical overview of the subject, approximations to land surface and atmospheric physics and dynamics, radiative transfer  ...  GPUs are "many-core" processors, with hundreds of processing elements.  ... 
doi:10.5772/32484 fatcat:xikghjeqizgergkimtzu7gxlvy

The reverse-acceleration model for programming petascale hybrid systems

S. Pakin, M. Lang, D. J. Kerbyson
2009 IBM Journal of Research and Development  
Current technology trends favor hybrid architectures, typically with each node in a cluster containing both general-purpose and specialized accelerator processors.  ...  The typical model for programming such systems is host-centric: The general-purpose processor orchestrates the computation, offloading performance-critical work to the accelerator, and data are communicated  ...  Computer Entertainment, Inc., in the United States, other countries, or both.  ... 
doi:10.1147/jrd.2009.5429074 fatcat:vaxso4lh35hddmdfdbfbacpww4

Runtime scheduling of dynamic parallelism on accelerator-based multi-core systems

Filip Blagojevic, Dimitrios S. Nikolopoulos, Alexandros Stamatakis, Christos D. Antonopoulos, Matthew Curtis-Maury
2007 Parallel Computing  
The term multi-grain parallelism refers to the exposure of multiple dimensions of parallelism from within the runtime system, so as to best exploit a parallel architecture with heterogeneous computational  ...  Heterogeneous multi-core processors integrate conventional cores that run legacy codes with specialized cores that serve as computational accelerators.  ...  We thank Xizhou Feng and Kirk Cameron for providing us with the MPI version of PBPI. We are also grateful to the anonymous reviewers for their constructive feedback on earlier versions of this paper.  ... 
doi:10.1016/j.parco.2007.09.004 fatcat:zxkvfw76ljedbbqhupvvahwolm

XcalableMP 2.0 and Future Directions [chapter]

Mitsuhisa Sato, Hitoshi Murai, Masahiro Nakao, Keisuke Tsugane, Tetsuya Odajima, Jinpil Lee
2020 XcalableMP PGAS Programming Language  
The porting and the performance evaluation were done as a part of this project, and the XcalableMP is available for the Fugaku users for improving the productivity and performance of parallel programming  ...  We conclude this book with retrospectives and challenges for future PGAS models.  ...  The node processor is a single chip, Fujitsu A64FX, which consists of 48 cores with 2 or 4 cores dedicated for OS activities, 32 GiB HBM2 memory, with Tofu-D interconnect, and a PCI express controller  ... 
doi:10.1007/978-981-15-7683-6_10 fatcat:nkzypwi6kbelbet2fpoz6kpk2m

A case study for petascale applications in astrophysics

C. D. Ott, E. Schnetter, G. Allen, E. Seidel, J. Tao, B. Zink
2008 Proceedings of the 15th ACM Mardi Gras conference on From lightweight mash-ups to lambda grids: Understanding the spectrum of distributed computing requirements, applications, tools, infrastructures, interoperability, and the incremental adoption of key capabilities - MG '08  
Here we present a pragmatic case study, focussing on the simulation of gamma-ray bursts as a science driver for petascale computing.  ...  We estimate the computational requirements for such simulations and delineate in what way petascale and peta-grid computing can be utilized in this context.  ...  C.D.O. acknowledges support through a Joint Institute for Nuclear Astrophysics postdoctoral fellowship, sub-award no. 61-5292UA of NFS award no. 86-6004791.  ... 
doi:10.1145/1341811.1341831 dblp:conf/mg/OttSASTZ08 fatcat:kw37h3li7zfrxnqqfqappdezmi

Online codesign on reconfigurable platform for parallel computing

Clément Foucher, Fabrice Muller, Alain Giulieri
2013 Microprocessors and microsystems  
This platform allows the underlying hardware to be virtualized in order to have a generic architecture that can be used to run applications.  ...  Reconfigurable hardware offers new ways of accelerating computing by implementing hardware accelerators at run time.  ...  For many years, parallelism was used only in the professional world, and with intensive computing applications.  ... 
doi:10.1016/j.micpro.2011.12.007 fatcat:bnvnv5dhongzbaxamkl5tjvezm

Distributed Machine Learning for Computational Engineering using MPI [article]

Kailai Xu, Weiqiang Zhu, Eric Darve
2020 arXiv   pre-print
Our parallel computing model views data communication as a node in the computational graph for numerical simulations.  ...  We propose a framework for training neural networks that are coupled with partial differential equations (PDEs) in a parallel computing environment.  ...  Note that ADCME uses hybrid parallel computing models, i.e., a mixture of multithreading programs and MPI communication; therefore, when we talk about one MPI processor, it may contain multiple CPU cores  ... 
arXiv:2011.01349v2 fatcat:rwx5bfrekvc35luscipcuzimdm

FastMPJ: a scalable and efficient Java message-passing library

Roberto R. Expósito, Sabela Ramos, Guillermo L. Taboada, Juan Touriño, Ramón Doallo
2014 Cluster Computing  
The performance and scalability of communications are key for High Performance Computing (HPC) applications in the current multi-core era.  ...  ., productivity, portability, multithreading) of Java for parallel programming, its poor communications support has hindered its adoption in the HPC community.  ...  We also gratefully thank the Advanced School for Computing and Imaging (ASCI) and the Vrije University Amsterdam for providing access to the DAS-4 cluster.  ... 
doi:10.1007/s10586-014-0345-4 fatcat:jxpqqaj3frbrfendjhmims72ty