Asynchronous Execution of Python Code on Task-Based Runtime Systems

R. Tohid, Bibek Wagle, Shahrzad Shirzad, Patrick Diehl, Adrian Serio, Alireza Kheirkhahan, Parsa Amini, Katy Williams, Kate Isaacs, Kevin Huck, Steven Brandt, Hartmut Kaiser
2018 2018 IEEE/ACM 4th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)  
variables into a dependency tree executed by HPX, a general purpose, parallel, task-based runtime system written in C++.  ...  Phylanx, is an asynchronous array processing toolkit which transforms Python and NumPy operations into code which can be executed in parallel on HPC resources by mapping Python and NumPy functions and  ...  Phylanx tackles this issue by providing a framework that can execute arbitrary Python code in a distributed setting using an asynchronous many-task runtime system.  ... 
doi:10.1109/espm2.2018.00009 dblp:conf/sc/TohidWSDSKAWIHB18 fatcat:nv3ak23sgzafxhc524r3nsdzau

Extended Abstract: Productive Parallel Programming with Parsl [article]

Kyle Chard, Yadu Babuji, Anna Woodard, Ben Clifford, Zhuozhao Li, Mihael Hategan, Ian Foster, Mike Wilde, Daniel S. Katz
2022 arXiv   pre-print
Developers can then link together functions via the exchange of data. Parsl establishes a dynamic dependency graph and sends tasks for execution on connected resources when dependencies are resolved.  ...  Parsl relies on developers annotating Python functions, wrapping either Python or external applications, to indicate that these functions may be executed concurrently.  ...  The graph is executed by translating the specification to code for a specific runtime system (e.g., in C++, Java, and .NET).  ... 
arXiv:2205.01527v1 fatcat:bli2ixefkrfabfpwzcmteyn3sa

torcpy: supporting task parallelism in Python

Panagiotis Hadjidoukas, Andrea Bartezzaghi, Florian Scheidegger, Roxana Istrate, Costas Bekas, Cristiano Malossi
2020 Zenodo  
Task-based parallelism has been established as one of the main forms of code parallelization, where asynchronous tasks are launched and distributed across the processing units of a local machine, a cluster  ...  In this work, we introduce torcpy, a platform-agnostic adaptive load balancing library that orchestrates the asynchronous execution of tasks, expressed as callables with arguments, on both shared and distributed  ...  The core runtime is implemented in Java and allows for tasks that include MPI code but does not support direct injection of MPI SPMD code in the task-parallel Python code.  ... 
doi:10.5281/zenodo.3985059 fatcat:rhp5on4g5ngyhc5hpgtdb4b3be

Velociraptor

Rahul Garg, Laurie Hendren
2014 Proceedings of the 23rd international conference on Parallel architectures and compilation - PACT '14  
Velociraptor provides an optimizing compiler toolkit for generating CPU and GPU code and also provides a smart runtime system to manage the GPU.  ...  To demonstrate Velociraptor in action, we present two proof-of-concept case studies: a GPU extension for a JIT implementation of MATLAB language, and a JIT compiler for Python targeting CPUs and GPUs.  ...  Their system required no annotations, and discovered sections of code suitable for execution on GPUs through profile-directed feedback.  ... 
doi:10.1145/2628071.2628097 dblp:conf/IEEEpact/GargH14 fatcat:b6qactysxnda3pe7enujb4vkkq

COMP Superscalar, an interoperable programming framework

Rosa M. Badia, Javier Conejero, Carlos Diaz, Jorge Ejarque, Daniele Lezzi, Francesc Lordan, Cristian Ramon-Cortes, Raul Sirvent
2015 SoftwareX  
A runtime system is in charge of exploiting the inherent concurrency of the code, automatically detecting and enforcing the data dependencies between tasks and spawning these tasks to the available resources  ...  For that purpose, it offers a simple programming model based on sequential development in which the user is mainly responsible for (i) identifying the functions to be executed as asynchronous parallel  ...  When the sequential code is executed, the COMPSs runtime intercepts the method invocations and replaces them with calls to the runtime that create new asynchronous tasks.  ... 
doi:10.1016/j.softx.2015.10.004 fatcat:wdihysbvvzgythxojl4uu7fbg4

The dog programming language

Salman Ahmad, Sepandar Kamvar
2013 Proceedings of the 26th annual ACM symposium on User interface software and technology - UIST '13  
in Dog in a few lines of code.  ...  While these applications confer a number of benefits to their users, building them brings many challenges: manually managing state between asynchronous user actions, creating and maintaining separate code  ...  Python, Ruby, Javascript). As a result, these languages are unable to take advantage of multicore CPUs. Dog's Solution: Task-Based Concurrency Dog provides a concurrency primitive called tasks.  ... 
doi:10.1145/2501988.2502026 dblp:conf/uist/AhmadK13 fatcat:ew7fgxlmwjckpdbvk6b4yx72x4

torcpy: Supporting task parallelism in Python

P.E. Hadjidoukas, A. Bartezzaghi, F. Scheidegger, R. Istrate, C. Bekas, A.C.I. Malossi
2020 SoftwareX  
Task-based parallelism has been established as one of the main forms of code parallelization, where asynchronous tasks are launched and distributed across the processing units of a local machine, a cluster  ...  In this work, we introduce torcpy, a platform-agnostic adaptive load balancing library that orchestrates the asynchronous execution of tasks, expressed as callables with arguments, on both shared and distributed  ...  The core runtime is implemented in Java and allows for tasks that include MPI code but does not support direct injection of MPI SPMD code in the task-parallel Python code.  ... 
doi:10.1016/j.softx.2020.100517 fatcat:zghs427huje37po5iqvfjrma7e

NDL-v2.0: A new version of the numerical differentiation library for parallel architectures

P.E. Hadjidoukas, P. Angelikopoulos, C. Voglis, D.G. Papageorgiou, I.E. Lagaris
2014 Computer Physics Communications  
On multicore clusters, parallelism is exploited by means of TORC, an MPI-based multi-threaded tasking library.  ...  This allows sequential Python codes to exploit shared and distributed memory systems.  ... 
doi:10.1016/j.cpc.2014.04.002 fatcat:22p7akyem5cmhm5z23ktnxdhhu

FBBeam: An Erlang-based IEC 61499 Implementation

Laurin Prenzel, Julien Provost
2019 2019 IEEE 17th International Conference on Industrial Informatics (INDIN)  
Possible execution semantics are presented and compared to the Erlang execution model. An initial case study examines the scalability of a multi-tasking runtime environment.  ...  This paper aims to investigate the benefits of reusing an existing soft real-time runtime system for the implementation of the IEC 61499.  ...  The Erlang Runtime System (ERTS) is the virtual machine in which the compiled Erlang code is executed.  ... 
doi:10.1109/indin41052.2019.8972123 dblp:conf/indin/PrenzelP19 fatcat:ir7rech7xvhjvigai5uir3mwkm

Automatic Parallelization of Python Programs for Distributed Heterogeneous Computing [article]

Jun Shirako, Akihiro Hayashi, Sri Raj Paul, Alexey Tumanov, Vivek Sarkar
2022 arXiv   pre-print
This paper introduces a novel approach to automatic ahead-of-time (AOT) parallelization and optimization of sequential Python programs for execution on distributed heterogeneous platforms.  ...  Finally, the optimized output code is deployed using the Ray runtime for scheduling distributed tasks across multiple heterogeneous nodes in a cluster.  ...  hybrid Python/C++ code generation, fine-grained NumPy-to-CuPy conversion, and profile-based CPU/GPU runtime selection.  ... 
arXiv:2203.06233v1 fatcat:4e7sa6j3szgfri5pajrgccuvuu

Toward Interlanguage Parallel Scripting for Distributed-Memory Scientific Computing

Justin M. Wozniak, Timothy G. Armstrong, Ketan C. Maheshwari, Daniel S. Katz, Michael Wilde, Ian T. Foster
2015 2015 IEEE International Conference on Cluster Computing  
We present here a new approach to these problems in which the Swift scripting system is used to integrate high-level scripts written in Python, R, and Tcl, with native code developed in C, C++, and Fortran  ...  However, deploying scripted applications on large-scale parallel computer systems such as the IBM Blue Gene/Q or Cray XE6 is a challenge because of issues including operating system limitations, interoperability  ...  ACKNOWLEDGMENTS This material was based upon work supported by the U.S. Dept. of Energy, Office of Science, Office of Advanced Scientific Computing Research, under Contract DE-AC02-06CH11357.  ... 
doi:10.1109/cluster.2015.74 dblp:conf/cluster/WozniakAMKWF15 fatcat:kg2qojdvubbqhf75lhfzs6m2ae

Towards a Scalable and Distributed Infrastructure for Deep Learning Applications [article]

Bita Hasheminezhad, Shahrzad Shirzad, Nanmiao Wu, Patrick Diehl, Hannes Schulz, Hartmut Kaiser
2020 arXiv   pre-print
parallelism and concurrency (HPX), leveraging fine-grained threading and an active messaging task-based runtime system.  ...  Phylanx presents a productivity-oriented frontend where user Python code is translated to a futurized execution tree that can be executed efficiently on multiple nodes using the C++ standard library for  ...  Acknowledgements The authors are grateful for the support of this work by the LSU Center for Computation & Technology and by the DTIC project: Phylanx Engine Enhancement and Visualizations Development  ... 
arXiv:2010.03012v1 fatcat:2hy7evtvdra2dotv35dvbhv7mu

Redesigning OP2 Compiler to Use HPX Runtime Asynchronous Techniques [article]

Zahra Khatami, Hartmut Kaiser, J. Ramanujam
2017 arXiv   pre-print
One solution to address this challenge is the use of runtime methods. This strategy can be implemented by delaying a certain amount of code analysis to be done at runtime.  ...  These optimizations include asynchronous tasking, loop interleaving, dynamic chunk sizing, and data prefetching.  ...  Acknowledgements We would like to thank Adrian Serio from Center for Computation and Technology at Louisiana State University for the invaluable and helpful comments and suggestions to improve the quality of  ... 
arXiv:1703.09264v1 fatcat:iwhar3uv5rfwxbz2mf7g7bkxlu

Balsam: Automated Scheduling and Execution of Dynamic, Data-Intensive HPC Workflows [article]

Michael A. Salim, Thomas D. Uram, J. Taylor Childers, Prasanna Balaprakash, Venkatram Vishwanath, Michael E. Papka
2019 arXiv   pre-print
We introduce the Balsam service to manage high-throughput task scheduling and execution on supercomputing systems.  ...  workflows, in which tasks are created or killed at runtime.  ...  Parsl [9] is a Python library that uses Swift/T as an execution backend for supercomputing environments, providing Python decorators to annotate existing codes for asynchronous, data-parallel execution  ... 
arXiv:1909.08704v1 fatcat:wqfpcf2vozbdzejvzqh3qrc75i

Parsl: Pervasive Parallel Programming in Python [article]

Yadu Babuji, Anna Woodard, Zhuozhao Li, Daniel S. Katz, Ben Clifford, Rohan Kumar, Lukasz Lacinski, Ryan Chard, Justin M. Wozniak, Ian Foster, Michael Wilde, Kyle Chard
2019 arXiv   pre-print
These constructs allow Parsl to construct a dynamic dependency graph of components that it can then execute efficiently on one or many processors.  ...  We show, via experiments on the Blue Waters supercomputer, that Parsl executors can allow Python scripts to execute components with as little as 5 ms of overhead, scale to more than 250 000 workers across  ...  It relied on the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (OCI-0725070, ACI-1238993) and the State of Illinois.  ... 
arXiv:1905.02158v1 fatcat:okcga7i4vza6zmx5lyj63seone