A Cluster-as-Accelerator Approach for SPMD-Free Data Parallelism

Maurizio Drocco, Claudia Misale, Marco Aldinucci
2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)
We implemented the proposed approach in SkeDaTo, a prototyping C++ library of data-parallel skeletons exploiting cluster-as-accelerator at the bottom layer of the runtime software stack.  ...  In this paper we present a novel approach for functional-style programming of distributed-memory clusters, targeting data-centric applications.  ...  SPMD-free data-centric programming: We recognize a fundamental limitation imposed by SPMD when used for programming data-centric distributed applications.  ... 
doi:10.1109/pdp.2016.97 dblp:conf/pdp/DroccoMA16 fatcat:bbbvx77tcbhhbhrtdtppjltyfy

Accelerating and Improving Simulation Performance in Communication Systems Modelling through Parallel Computing and Clustering

Kacper Kapusniak, Tanmay Nautiyal, Ryan Grammenos
2021 Journal of student research  
As most of these simulations require millions of computations, they take a significantly long time to run (for example, days) as they run on single-core machines and carry out the computations serially  ...  while also scaling these scripts onto a multi-core cluster to further improve the execution time.  ...  Finally, the authors are grateful to Dr Tongyang Xu, member of the Information and Communication Engineering (ICE) group in the department of Electronic and Electrical Engineering at UCL, for providing  ... 
doi:10.47611/jsr.v9i2.829 fatcat:fljbkqbofzhszbmkhzwfuwvz6q

Exploring Graphics Processing Unit (GPU) Resource Sharing Efficiency for High Performance Computing

Teng Li, Vikram Narayana, Tarek El-Ghazawi
2013 Computers  
In this paper, we propose to efficiently share the GPU under SPMD and formally define a series of GPU sharing scenarios.  ...  The increasing incorporation of Graphics Processing Units (GPUs) as accelerators has been one of the forefront High Performance Computing (HPC) trends and provides unprecedented performance; however, the  ...  GPU Sharing Scenarios GPU Sharing Approach with Streams for SPMD Programs For a given SPMD program, the program parallelism can be expressed at two different levels: process-level and thread-level.  ... 
doi:10.3390/computers2040176 fatcat:atudxjavxrb6vijetebex2sbju

A technique to automatically determine Ad-hoc communication patterns at runtime

Ana Moreton-Fernandez, Arturo Gonzalez-Escribano, Diego R. Llanos
2017 Parallel Computing  
In this paper, we present a new communication calculation technique to be applied across different SPMD (Single Program Multiple Data) code blocks, containing several uniform data access expressions.  ...  Current High Performance Computing (HPC) systems are typically built as interconnected clusters of shared-memory multicore computers.  ...  research has been partially supported by MICINN (Spain) and ERDF program of the European Union: HomProg-HetSys project (TIN2014-58876-P), CAPAP-H6 (TIN2016-81840-REDT), COST Program Action IC1305: Network for  ... 
doi:10.1016/j.parco.2017.08.009 fatcat:66usedjgaze4tjngawddcnfyxe

Modeling energy consumption of parallel applications

Paweł Czarnul, Jarosław Kuchta, Paweł Rościszewski, Jerzy Proficz
2016 Proceedings of the 2016 Federated Conference on Computer Science and Information Systems  
The paper presents modeling and simulation of energy consumption of two types of parallel applications: geometric Single Program Multiple Data (SPMD) and divide-and-conquer (DAC).  ...  We performed verification of running the two applications on up to 512 and 1024 processes respectively on a large cluster from Academic Computer Center in Gdansk demonstrating a high degree of accuracy  ...  several clusters each with multicore CPUs and accelerators such as GPUs.  ... 
doi:10.15439/2016f308 dblp:conf/fedcsis/CzarnulKRP16 fatcat:hagfwk6iyfg23klflpfdbrhl64

A parallel architecture for video processing [chapter]

D. Turgay Altilar, Yakup Paker, A. Vahit Sahiner
1997 Lecture Notes in Computer Science  
The parallel architecture for video processing has been developed at Queen Mary and Westfield College as a part of a European Union RACE II project called MONALISA.  ...  A multiprocessing kernel and a high level software environment called SAPS (self adapting parallel server) model has been developed for this architecture.  ...  Our architectural approach is based on a scalable shared address space multi-processors using Single Program Multiple Data (SPMD) parallelism and is aimed at real-time processing of broadcast quality  ... 
doi:10.1007/bfb0031664 fatcat:pqlp26dqnfehbeqr4rzxkfolj4

GPU Computing Taxonomy [chapter]

Abdelrahman Ahmed Mohamed Osman
2017 Recent Progress in Parallel and Distributed Computing  
The main advantage of GPU computing is that it provides cheap parallel processing environments for those who need to solve single program multiple data (SPMD) problems.  ...  Over the past few years, a number of efforts have been made to obtain benefits from graphic processing unit (GPU) devices by using them in parallel computing.  ...  Parallelism of threads in a GPU is suitable for executing the same copy of a single program on different data [single program multiple data (SPMD)], i.e. data parallelism [8].  ... 
doi:10.5772/intechopen.68179 fatcat:4ur5ykfuwjb5vmwv5a2qiewsqa

MPI microtask for programming the Cell Broadband Engine™ processor

M. Ohara, H. Inoue, Y. Sohda, H. Komatsu, T. Nakatani
2006 IBM Systems Journal  
These differences have led us to a new clustering approach in our static scheduling algorithm. The contribution of this paper is twofold.  ...  First, we propose a microtask model for the Cell BE processor. It frees programmers from explicit local-store management, which could be a significant burden for them.  ...  As a result, each cluster represents a coarse-grain computation assigned for each processor. This approach is suitable for coarse-grain parallel systems, such as clustered workstations.  ... 
doi:10.1147/sj.451.0085 fatcat:6jekxrtumfb6ldz6g6hkh54lze

Implementation and Evaluation of Multiple GridRPC Services for Molecular Dynamics Simulations of Proteins

Takashi Amisaki, Shin-ichi Fujiwara
2006 IPSJ Digital Courier  
This paper reports a protein-simulation grid that uses grid remote procedure calls (GridRPCs) to a special-purpose cluster machine for molecular dynamics simulations.  ...  Simulations performed using a four-node cluster and a 100-Mbps LAN for GridRPC sessions were 4.6-17.0 times faster than the same simulation performed on the local client PC, while their communication overhead  ...  Hajime Fukuzawa (NEC Corporation) for their To our knowledge, there are currently two new machines: MDG3-system (SGI Japan, Ltd.) built with MDGRAPE-3 boards (RIKEN) and MD Server (NEC Corp.).  ... 
doi:10.2197/ipsjdc.2.573 fatcat:pyz4m5lfjvf2nfgqnrmwxbrk4e

A Robust Background Initialization Algorithm with Superpixel Motion Detection [article]

Zhe Xu, Biao Min, Ray C.C. Cheung
2018 arXiv pre-print
A low-complexity density-based clustering is then performed to generate reliable background candidates for final background determination.  ...  A subsequence with stable illumination condition is first selected for background estimation.  ...  The most time consuming period is superpixel segmentation and it can be further accelerated with a CPU or GPU based parallel implementation.  ... 
arXiv:1805.06737v1 fatcat:7q6rszwaczgaxktjvsrgk3pq5e

Research Challenges in Parallel and Distributed Simulation

Richard M. Fujimoto
2016 ACM Transactions on Modeling and Computer Simulation  
environments, predictive online simulation for system management and optimization, power and energy consumption in mobile platforms and data centers, and composition of heterogeneous simulations.  ...  A brief overview of research in the field is presented.  ...  For example, in Kunz et al. [2012] , an approach is described that involves sorting events according to event type and clustering their execution to address the problems arising from SPMD execution described  ... 
doi:10.1145/2866577 fatcat:cvkjgwzlangnbpjuu52pdrlvlq

Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs

John A. Stratton, Vinod Grover, Jaydeep Marathe, Bastiaan Aarts, Mike Murphy, Ziang Hu, Wen-mei W. Hwu
2010 Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization - CGO '10  
over a baseline approach.  ...  In this paper we describe techniques for compiling fine-grained SPMD-threaded programs, expressed in programming models such as OpenCL or CUDA, to multicore execution platforms.  ...  Acknowledgments We would like to thank NVIDIA Corporation for its support, and our formal and informal reviewers for their feedback.  ... 
doi:10.1145/1772954.1772971 dblp:conf/cgo/StrattonGMAMHH10 fatcat:5fyeryjhwjanvmfdv3tnuo4hw4

Fault tolerance at system level based on RADIC architecture

Marcela Castro-León, Hugo Meyer, Dolores Rexachs, Emilio Luque
2015 Journal of Parallel and Distributed Computing  
In those cases, the affected processes are recovered in a healthy node and the connections are reestablished without losing data.  ...  A prototype has been implemented to carry out an exhaustive experimental evaluation through Master/Worker and Single Program Multiple Data execution models.  ...  When a node of a Linux cluster has a fault while a message passing application is running, the communications with the processes located in the rest of the nodes fail as well and in-transit data might  ... 
doi:10.1016/j.jpdc.2015.08.005 fatcat:shngouhqd5awnm55jfovsccrxi

Direct observation of site-selective hydrogenation and spin-polarization in hydrogenated hexagonal boron nitride on Ni(111)

Manabu Ohtomo, Yasushi Yamauchi, Xia Sun, Alex A. Kuzubov, Natalia S. Mikhaleva, Pavel V. Avramov, Shiro Entani, Yoshihiro Matsumoto, Hiroshi Naramoto, Seiji Sakai
2017 Nanoscale  
Acknowledgements This work is supported by the Grants-in-Aid for Scientific Research (Grant Numbers 23860067, 24760033 and 16H03875) from the Japan Society for the Promotion of Science (JSPS).  ...  A part of this work is performed under the approval of the Photon Factory Advisory Committee (PF PAC No. 2010G660, 2012G741 and 2015G110) of KEK, Japan.  ...  For a comprehensive understanding of hydrogenated h-BN/Ni(111), methods completely free from SIE are strongly awaited.  ... 
doi:10.1039/c6nr06308j pmid:28145546 fatcat:gwn67v6ffbdnbcwdp6wsuqrjfa

SkePU 3: Portable High-Level Programming of Heterogeneous Systems and HPC Clusters

August Ernstsson, Johan Ahlqvist, Stavroula Zouzoula, Christoph Kessler
2021 International journal of parallel programming  
user functions to exploit e.g. custom SIMD instructions, generalized scheduling variants for the multicore CPU backends, and a new cluster-backend targeting the custom MPI interface provided by the StarPU  ...  We have also revised the smart data containers' memory consistency model for automatic data sharing between main and device memory.  ...  as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.  ... 
doi:10.1007/s10766-021-00704-3 fatcat:hddmwhwk75ay7ddyx3glkpkhfu