48 Hits in 2.0 sec

A Scalable Actor-based Programming System for PGAS Runtimes [article]

Sri Raj Paul, Akihiro Hayashi, Kun Chen, Vivek Sarkar
2022 arXiv   pre-print
Our approach can also be viewed as extending the classical Bulk Synchronous Parallelism model with fine-grained asynchronous communications within a phase or superstep.  ...  For mini-applications from the Bale benchmark suite executed on 2048 cores of the NERSC Cori system, our approach shows geometric mean performance improvements of ≥20x relative to standard PGAS versions (UPC and OpenSHMEM).  ...  Further, this approach can be integrated with the standard synchronization constructs in asynchronous task runtimes (e.g., async-finish, future constructs).  ... 
arXiv:2107.05516v3 fatcat:msqq22xjkbhbfb6g64djde74ky
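The combination described above, fine-grained asynchronous messages inside an otherwise bulk-synchronous superstep, can be illustrated with a small sketch. This is a plain-Python analogue, not the paper's actor runtime; the worker/mailbox names are invented for the example.

```python
# Hypothetical sketch: a BSP-style superstep in which workers fire off
# fine-grained asynchronous messages during the phase, and a barrier marks
# the end of the superstep, after which every message has been delivered.
import threading
import queue

NUM_WORKERS = 4
mailboxes = [queue.Queue() for _ in range(NUM_WORKERS)]
barrier = threading.Barrier(NUM_WORKERS)
received = [[] for _ in range(NUM_WORKERS)]

def worker(rank: int) -> None:
    # Phase: send fine-grained messages without waiting for replies.
    for peer in range(NUM_WORKERS):
        if peer != rank:
            mailboxes[peer].put((rank, f"msg-from-{rank}"))
    barrier.wait()  # end of superstep: all sends have been issued
    # Drain the mailbox; every message of this superstep has arrived.
    while not mailboxes[rank].empty():
        received[rank].append(mailboxes[rank].get())

threads = [threading.Thread(target=worker, args=(r,)) for r in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The barrier plays the role of the superstep boundary: no worker inspects its mailbox until every peer has finished sending.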

Author index

2014 IEEE International Conference on Cluster Computing (CLUSTER)  
... Memory Hierarchy of GPGPU Clusters for Stencil Computations; Epema, Dick: KOALA-C: A Task Allocator for Integrated Multicluster and Multicloud Environments; Espinosa, Antonio: Job Scheduling in Hadoop with  ...  Shared Input Policy and RAMDISK; Dong, Bin: Parallel Query Evaluation as Scientific Data Service; Dongarra, Jack: Power Monitoring with PAPI for Extreme Scale Architectures and Dataflow-based Programming  ... 
doi:10.1109/cluster.2014.6968663 fatcat:cgfz2ihyufauljzrlt75me7y3e

HiPC 2021 Workshop on Parallel Programming in the Exascale Era (PPEE 2021)

Vivek Kumar, Swarnendu Biswas, Vishwesh Jatala
2021 IEEE 28th International Conference on High Performance Computing, Data and Analytics Workshop (HiPCW)  
The upcoming exascale systems will impose new requirements on application developers and programming systems to target platforms with hundreds of homogeneous and heterogeneous cores.  ...  The four critical challenges for exascale systems are extreme parallelism, power demand, data movement, and reliability.  ...  The third idea is to build on an asynchronous tasking runtime within each node, and to extend it with message aggregation and message handling capabilities.  ... 
doi:10.1109/hipcw54834.2021.00014 fatcat:pgtv7sjr7jewnjtgxg5j36n4li
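The third idea in the entry above, extending a per-node tasking runtime with message aggregation, can be sketched as a small coalescing buffer. The class and parameter names here are illustrative, not from any of the workshop papers.

```python
# Hypothetical sketch of message aggregation: small messages bound for the
# same destination are coalesced into a per-destination buffer and handed to
# the transport as one batch once a threshold is reached.
from collections import defaultdict

class Aggregator:
    def __init__(self, threshold: int, transport):
        self.threshold = threshold        # messages per batch
        self.transport = transport        # callable(dest, batch)
        self.buffers = defaultdict(list)  # dest -> pending messages

    def send(self, dest: int, msg) -> None:
        buf = self.buffers[dest]
        buf.append(msg)
        if len(buf) >= self.threshold:    # batch full: one real send
            self.flush(dest)

    def flush(self, dest: int) -> None:
        if self.buffers[dest]:
            self.transport(dest, self.buffers.pop(dest))

    def flush_all(self) -> None:          # e.g. at a phase boundary
        for dest in list(self.buffers):
            self.flush(dest)

sent = []
agg = Aggregator(threshold=3, transport=lambda d, batch: sent.append((d, batch)))
for i in range(7):
    agg.send(dest=0, msg=i)
agg.flush_all()  # drain the partially filled buffer
```

The point of the pattern is amortization: seven logical sends become three physical transfers, trading a little latency for far fewer network injections.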

A Framework for Developing Parallel Applications with high level Tasks on Heterogeneous Platforms

Chao Liu, Miriam Leeser
2017 Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores - PMAM'17  
Users can easily implement coarse-grained task parallelism with multiple tasks running concurrently.  ...  The support of both CPU tasks and GPU tasks helps users develop and run parallel applications on heterogeneous platforms.  ...  The runtime system will convert a function declared with async to a task (an activity) that will be scheduled to run asynchronously.  ... 
doi:10.1145/3026937.3026946 dblp:conf/ppopp/LiuL17 fatcat:gelcnml2nfbixcjt5ttdbols24

Extending the OpenSHMEM Memory Model to Support User-Defined Spaces

Aaron Welch, Swaroop Pophale, Pavel Shamis, Oscar Hernandez, Stephen Poole, Barbara Chapman
2014 Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models - PGAS '14  
OpenSHMEM is an open standard for SHMEM libraries.  ...  With the standardisation process complete, the community is looking towards extending the API for increasing programmer flexibility and extreme scalability.  ...  Additionally, providing an abstraction for symmetric memory allocation makes adapting to heterogeneous architectures a simpler task for OpenSHMEM.  ... 
doi:10.1145/2676870.2676884 dblp:conf/pgas/WelchPSHPC14 fatcat:vtvjorwm2rbgljjeyrxf73hmiy

Caching Puts and Gets in a PGAS Language Runtime

Michael P. Ferguson, Daniel Buettner
2015 9th International Conference on Partitioned Global Address Space Programming Models  
The cache is implemented as a software write-back cache with dirty bits, local memory consistency operations, and programmer-guided prefetch.  ...  It supports parallelism within and across nodes via high-level abstractions for data parallelism and task parallelism.  ...  When a page with the prefetch trigger is read, an asynchronous prefetch for the next region needing prefetch is started.  ... 
doi:10.1109/pgas.2015.10 fatcat:qnnnbqzmcnc7foh7osgdarucki
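The mechanism named in the snippet, a software write-back cache with dirty bits whose contents are flushed at memory-consistency operations, can be sketched briefly. This is a toy model in plain Python, not Chapel's actual runtime; the `fence` name stands in for whatever consistency operation the runtime uses.

```python
# Hypothetical sketch of a software write-back cache for remote puts/gets:
# puts are buffered locally and marked dirty; a consistency operation
# (fence) writes the dirty entries back to remote memory in one pass.
class WriteBackCache:
    def __init__(self, backing: dict):
        self.backing = backing  # stands in for remote memory
        self.lines = {}         # addr -> locally cached value
        self.dirty = set()      # addrs holding unwritten puts

    def get(self, addr):
        if addr not in self.lines:          # miss: fetch the remote value
            self.lines[addr] = self.backing.get(addr)
        return self.lines[addr]

    def put(self, addr, value):
        self.lines[addr] = value            # buffer the put locally
        self.dirty.add(addr)                # mark for write-back

    def fence(self):
        for addr in self.dirty:             # flush dirty lines at a fence
            self.backing[addr] = self.lines[addr]
        self.dirty.clear()

remote = {"x": 1}
cache = WriteBackCache(remote)
cache.put("x", 42)
before = remote["x"]   # the put is not yet visible remotely
cache.fence()
after = remote["x"]    # the fence makes it visible
```

The dirty set is what lets repeated puts to the same address collapse into a single remote write at the fence.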

Type oriented parallel programming for Exascale [article]

Nick Brown
2016 arXiv   pre-print
Most of the language functionality is contained within a loosely coupled type library that can be flexibly used to control many aspects such as parallelism.  ...  The programmer is able to write simple, implicitly parallel HPC code at a high level and then explicitly tune it by adding additional type information if required.  ...  with asynchronous communication increased the line count by 80%.  ... 
arXiv:1610.08691v1 fatcat:mdniqa7pwfbdpaqhgpnbcxvfly

The GASPI API: A Failure Tolerant PGAS API for Asynchronous Dataflow on Heterogeneous Architectures [chapter]

Christian Simmendinger, Mirko Rahn, Daniel Gruenewald
2014 Sustained Simulation Performance 2014  
In order to achieve its much-improved scaling behaviour, GASPI leverages request-based asynchronous dataflow with remote completion.  ...  A correspondingly implemented fine-grain asynchronous dataflow model can achieve a largely improved scaling behaviour relative to MPI.  ...  In parallel to the data transfer, a local task (work) can be executed in order to overlap the communication with the computation.  ... 
doi:10.1007/978-3-319-10626-7_2 fatcat:22fzjvxy2jb4pcamwg4c2af5vi
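The overlap pattern described in the snippet, post a one-sided transfer, run local work, and wait on completion only when the data is needed, can be sketched as follows. The names are invented; this is a plain-Python analogue rather than the GASPI API.

```python
# Sketch of communication/computation overlap: the transfer is posted
# asynchronously, local work proceeds in parallel, and remote completion
# is awaited only at the point where the result is actually required.
from concurrent.futures import ThreadPoolExecutor
import time

def remote_write(data: bytes) -> int:
    # Stands in for a one-sided put with remote-completion notification.
    time.sleep(0.05)
    return len(data)

local_results = []
with ThreadPoolExecutor(max_workers=1) as comm:
    handle = comm.submit(remote_write, b"block-0")  # post the transfer
    for i in range(3):                              # overlap: local work
        local_results.append(i * i)
    transferred = handle.result()                   # wait on completion
```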

Type oriented parallel programming for Exascale

Nick Brown
2017 Advances in Engineering Software  
Most of the language functionality is contained within a loosely coupled type library that can be flexibly used to control many aspects such as parallelism.  ...  The programmer is able to write simple, implicit parallel, HPC code at a high level and then explicitly tune by adding additional type information if required.  ...  with asynchronous communication increased the line count by 80%.  ... 
doi:10.1016/j.advengsoft.2017.04.006 fatcat:dpva5buxdjaylcb4kurh7xtknu

Early Evaluation of Scalable Fabric Interface for PGAS Programming Models

Miao Luo, Kayla Seager, Karthik S. Murthy, Charles J. Archer, Sayantan Sur, Sean Hefty
2014 Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models - PGAS '14  
The OpenFabrics Alliance has recently initiated an effort to revamp the fabric communication interface to better suit parallel programming models.  ...  The chief distinguishing feature is that the new interfaces are being co-designed alongside the applications that use them, such as PGAS communication libraries.  ...  PGAS libraries and compilers face a tough task in providing ordering semantics in a portable manner.  ... 
doi:10.1145/2676870.2676871 dblp:conf/pgas/LuoSMASH14 fatcat:rrgkcvho4ncfjphhv2hfxkztra

FPGA Implementation of On-Chip Network

N Murali Krishna
2018 DJ Journal of Advances in Electronics and Communication Engineering  
Coarse-Grained Arrays (CGAs) with run-time reconfigurability make it a challenging task to design Network-on-Chip (NoC) communication systems that satisfy the power and area constraints of embedded systems.  ...  This paper presents the design of a 32-bit UART (Universal Asynchronous Receiver Transmitter) RISC (Reduced Instruction Set Computing) processor with a dynamic power management system to minimize power consumption  ...  Interconnection between the resources is a challenging task [26].  ... 
doi:10.18831/djece.org/2018021001 fatcat:jfgj5g733zbi5mgkfypfzvn6ga

Fibers are not (P)Threads

Joseph Schuchart, Christoph Niethammer, José Gracia
2020 27th European MPI Users' Group Meeting  
Asynchronous programming models (APM) are gaining more and more traction, allowing applications to expose the available concurrency to a runtime system tasked with coordinating the execution.  ...  We show that this interface is flexible and interacts well with different APMs, namely OpenMP detached tasks, OmpSs-2, and Argobots.  ...  The authors would like to thank Vicenç Beltran Querol and Kevin Sala at Barcelona Supercomputing Center for their support with OmpSs-2 and Joachim Protze for his support with Clang's libomp.  ... 
doi:10.1145/3416315.3416320 dblp:conf/pvm/SchuchartNG20 fatcat:ooxeyz2pwjbcjnw534dwnqovoa

Callback-based completion notification using MPI Continuations

Joseph Schuchart, Philipp Samfass, Christoph Niethammer, José Gracia, George Bosilca
2021 Parallel Computing  
Asynchronous programming models (APM) are gaining more and more traction, allowing applications to expose the available concurrency to a runtime system tasked with coordinating the execution.  ...  We then present some of our first experiences in using the interface in the context of different applications, including the NAS parallel benchmarks, the PaRSEC task-based runtime system, and a load-balancing  ...  With only approximately 15 lines of code (including setup and tear-down), it is possible to integrate MPI communication with a task-parallel application to fully overlap communication and computation.  ... 
doi:10.1016/j.parco.2021.102793 fatcat:nn4rbtlw3fa2dglctm6loio6ie
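The callback-based completion notification described above can be approximated in a few lines: a continuation is attached to an outstanding operation and fires when it completes, so the task runtime resumes the dependent work instead of polling. This is a plain-Python analogue, not the MPI Continuations interface itself.

```python
# Rough analogue of callback-based completion notification: a continuation
# is registered on an in-flight operation and invoked once the operation
# completes, while the main thread stays free for other work.
from concurrent.futures import ThreadPoolExecutor
import threading

done = threading.Event()
log = []

def fake_recv() -> str:
    # Stands in for an asynchronous communication operation.
    return "payload"

def continuation(fut):
    # Invoked by the pool once the operation has completed.
    log.append(f"received {fut.result()}")
    done.set()

with ThreadPoolExecutor(max_workers=1) as pool:
    request = pool.submit(fake_recv)         # post the operation
    request.add_done_callback(continuation)  # attach the continuation
    done.wait(timeout=5)                     # main thread is free meanwhile
```

The continuation fires on completion rather than at a test/wait call, which is exactly the property that lets a task runtime interleave other work.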

Optimization of Computationally and I/O Intense Patterns in Electronic Structure and Machine Learning Algorithms

Michal Pitonak, Marian Gall, Adrian Rodriguez-Bazaga, Valeria Bartsch
2019 Zenodo  
Utilization of the full potential of (near-)future supercomputers will most likely require the mastery of massively parallel heterogeneous architectures with multi-tier persistence systems, ideally in  ...  Development of scalable High-Performance Computing (HPC) applications is already a challenging task even in the pre-Exascale era.  ...  GASPI is most closely related to OpenSHMEM, but provides a more general concept of notifications.  ... 
doi:10.5281/zenodo.2807937 fatcat:5szkqofx3bcqpaypsnrqnjtbue

XcalableMP 2.0 and Future Directions [chapter]

Mitsuhisa Sato, Hitoshi Murai, Masahiro Nakao, Keisuke Tsugane, Tetsuya Odajima, Jinpil Lee
2020 XcalableMP PGAS Programming Language  
We are now working on the next version, XcalableMP 2.0, for cutting-edge high-performance systems with manycore processors, using multithreading and multi-tasking with integrations of the PGAS model and synchronization  ...  We conclude this book with retrospectives and challenges for future PGAS models.  ... 
doi:10.1007/978-981-15-7683-6_10 fatcat:nkzypwi6kbelbet2fpoz6kpk2m
Showing results 1 — 15 out of 48 results