Quantifying Overheads in Charm++ and HPX using Task Bench [article]

Nanmiao Wu and Ioannis Gonidelis and Simeng Liu and Zane Fink and Nikunj Gupta and Karame Mohammadiporshokooh and Patrick Diehl and Hartmut Kaiser and Laxmikant V. Kale
2022 arXiv   pre-print
were implemented, e.g., MPI, OpenMP, MPI + OpenMP, and extend Task Bench by adding HPX implementations.  ...  In this paper, we present the comparison of the AMT systems Charm++ and HPX with the mainstream MPI, OpenMP, and MPI+OpenMP libraries using the Task Bench benchmarks.  ...  Here, we need to investigate the differences with respect to MPI and do some profiling with the tools provided by both AMTs.  ... 
arXiv:2207.12127v1 fatcat:kpmd3c2u7vgmhka4ewlb5aa5by

Towards a Scalable and Distributed Infrastructure for Deep Learning Applications [article]

Bita Hasheminezhad, Shahrzad Shirzad, Nanmiao Wu, Patrick Diehl, Hannes Schulz, Hartmut Kaiser
2020 arXiv   pre-print
parallelism and concurrency (HPX), leveraging fine-grained threading and an active messaging task-based runtime system.  ...  Phylanx presents a productivity-oriented frontend where user Python code is translated to a futurized execution tree that can be executed efficiently on multiple nodes using the C++ standard library for  ...  Acknowledgements The authors are grateful for the support of this work by the LSU Center for Computation & Technology and by the DTIC project: Phylanx Engine Enhancement and Visualizations Development  ... 
arXiv:2010.03012v1 fatcat:2hy7evtvdra2dotv35dvbhv7mu

A Comparative Study of Asynchronous Many-Tasking Runtimes: Cilk, Charm++, ParalleX and AM++ [article]

Abhishek Kulkarni, Andrew Lumsdaine
2019 arXiv   pre-print
We compare along three bases: programming model, execution model and the implementation on an underlying machine model.  ...  We evaluate and compare four contemporary and emerging runtimes for high-performance computing (HPC) applications: Cilk, Charm++, ParalleX and AM++.  ...  Execution Model: AM++ and Implementation AM++ is an active message framework based on generic programming techniques.  ... 
arXiv:1904.00518v1 fatcat:euvfhakryzcbdmhpbrrxrfu6he

Code modernization strategies for short-range non-bonded molecular dynamics simulations [article]

James Vance, Zhen-Hao Xu, Nikita Tretyakov, Torsten Stuehn, Markus Rampp, Sebastian Eibl, Christoph Junghans, André Brinkmann
2022 arXiv   pre-print
We also implement fine-grained parallelism for multi-core CPUs through HPX, a C++ runtime system which uses lightweight threads and an asynchronous many-task approach to maximize concurrency.  ...  techniques to benefit the calculation of short-range non-bonded forces, which results in an overall three times speedup and serves as a baseline for further optimizations.  ...  for High Performance Computing in Rhineland Palatinate, https://www.ahrp.info) and the Gauss Alliance.  ... 
arXiv:2109.10876v2 fatcat:la66flxqwremzhplztr6cd2yae

Scalable Data Management of the Uintah Simulation Framework for Next-Generation Engineering Problems with Radiation [chapter]

Sidharth Kumar, Alan Humphrey, Will Usher, Steve Petruzza, Brad Peterson, John A. Schmidt, Derek Harris, Ben Isaac, Jeremy Thornock, Todd Harman, Valerio Pascucci, Martin Berzins
2018 Lecture Notes in Computer Science  
In this paper we present a simple to implement, restructuring based parallel I/O technique. We impose a restructuring step that alters the distribution of data among processes.  ...  Uintah provides a highly scalable asynchronous many-task runtime system, which in this work is used for the modeling of a 1000 megawatt electric (MWe) ultra-supercritical (USC) coal boiler.  ...  An award of computer time was provided by the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program.  ... 
doi:10.1007/978-3-319-69953-0_13 fatcat:lodqec3lujdxrmd3qphlmds7cq

A Task-Based Parallel Rendering Component For Large-Scale Visualization Applications [article]

Tim Biedert, Kilian Werner, Bernd Hentschel, Christoph Garth
2017 Eurographics Symposium on Parallel Graphics and Visualization  
We conduct comprehensive benchmarks to verify the characteristics and potential of our novel task-based system design for high-performance visualization.  ...  We demonstrate a flexible parallel rendering framework built upon a task-based dynamic runtime environment enabling adaptable performance-oriented deployment on various platform configurations.  ...  Acknowledgement This research was funded in part by the German Research Foundation (DFG) within the IRTG 2057 "Physical Modeling for Virtual Manufacturing Systems and Processes".  ... 
doi:10.2312/pgv.20171094 dblp:conf/egpgv/BiedertWHG17 fatcat:6xtxcxv5izgghph3o2e3rq4enm

A Smoothed Particle Hydrodynamics Mini-App for Exascale [article]

Aurélien Cavelan, Rubén M. Cabezón, Michal Grabarczyk, Florina M. Ciorba
2020 arXiv   pre-print
In this work, we review the status of a novel SPH-EXA mini-app, which is the result of an interdisciplinary co-design project between the fields of astrophysics, fluid dynamics and computer science, whose  ...  Parallelism is expressed via multiple programming models, which can be chosen at compilation time with or without accelerator support, for a hybrid process+thread+accelerator configuration.  ...  Several performance results were provided by the Performance Optimisation and Productivity (POP) centre of excellence in HPC.  ... 
arXiv:2005.02656v1 fatcat:qr6tq4d6vjbftf37hsqbg7s4b4

Architectural specification for massively parallel computers: an experience and measurement-based approach

Ron Brightwell, William Camp, Benjamin Cole, Erik DeBenedictis, Robert Leland, James Tomkins, Arthur B. Maccabe
2005 Concurrency and Computation  
We discuss the evolution of this architecture and provide reasons for the different choices that have been made.  ...  We contrast our approach of leveraging high-volume, mass-market commodity processors to that taken for the Earth Simulator.  ...  The maximum throughput and latency for an MPI put operation is 11.63 GB/s and 6.63 µs, respectively.  ... 
doi:10.1002/cpe.893 fatcat:pzr2kymiajeshassp7oiinr2oi

Hierarchical Parallelism for Transient Solid Mechanics Simulations

D. Littlewood, R. Jones, N. Morales, J. Plews, U. Hetmaniuk, J. Lifflander
2021 14th WCCM-ECCOMAS Congress   unpublished
While these hardware advancements have the potential to significantly reduce turnaround times, they also present implementation and design challenges for engineering codes.  ...  We investigate the use of two strategies to mitigate these challenges: the Kokkos library for performance portability across disparate architectures, and the DARMA/vt library for asynchronous many-task  ...  ., for the U.S.  ... 
doi:10.23967/wccm-eccomas.2020.164 fatcat:ttxncxtezvaydc7jy5jorgmhna

29th International Conference on Data Engineering [book of abstracts]

2013 2013 IEEE 29th International Conference on Data Engineering Workshops (ICDEW)  
Our method is more scalable than an MPI algorithm, and is simpler and more fault tolerant.  ...  We have implemented these three designs in SQL Server 2012. For each design, both the write-through and write-back SSD caching policies were implemented.  ...  These volunteers welcome participants, give directions, help in the sessions and on the registration desk, and generally make sure the conference is running smoothly.  ... 
doi:10.1109/icdew.2013.6547409 fatcat:wadzpuh3b5htli4mgb4jreoika

Industrial and Project Presentations [article]

Felipe A. Lozano, Francisco Serón
2003 Eurographics State of the Art Reports  
This volume contains the Industrial and Project Presentations for the 24th annual Conference of the European Association for Computer Graphics, EUROGRAPHICS'03, held in Granada, Spain, between the 1st  ...  and 6th of September 2003.  ...  We thank GAMA-team for their work and collaboration.  ... 
doi:10.2312/egid.20031006 fatcat:loh7chebubg2vatd5syfugub24

Exploiting graphical processing units for data-parallel scientific applications

A. Leist, D. P. Playne, K. A. Hawick
2009 Concurrency and Computation  
As well as reporting speed-up performance on selected simulation paradigms, we discuss suitable data-parallel algorithms and present code examples for exploiting GPU features like large numbers of threads  ...  We find a surprising variation in the performance that can be achieved on GPUs for our applications and discuss how these findings relate to past known effects in parallel computing such as memory speed-related  ...  Bond and C.J. Scogings for helpful discussions on GPUs and on scientific programming.  ... 
doi:10.1002/cpe.1462 fatcat:zdr3r4kn25dqpl42ugle5qprea

On-Line Chemistry Within WRF: Description and Evaluation of a State-of-the-Art Multiscale Air Quality and Weather Prediction Model [chapter]

Georg Grell, Jerome Fast, William I. Gustafson, Steven E. Peckham, Stuart McKeen, Marc Salzmann, Saulo Freitas
2010 Integrated Systems of Meso-Meteorological and Chemical Transport Models  
This study was supported by an internal special project fund of the Japan Agency for Marine-Earth Science and Technology (JAMSTEC).  ...  Chapman, and G.A. Grell (2006), Evolution of ozone, particulates, and aerosol direct forcing in an urban area using a new fully-coupled meteorology, chemistry, and aerosol model, J. Geophys.  ...  Currently, only 2D and 3D nearest-neighbour, 2D and 3D linear, and bi-cubic regridding, and 2D conservative remapping techniques are implemented, but there are plans to implement also 3D cubic grid interpolation  ... 
doi:10.1007/978-3-642-13980-2_3 fatcat:wwygpb4jnngvthp33ana5ps6li

BOOKLET_ASHPC22

Eduard Reiter
2022
EuroHPC implements an ambitious research and innovation program which covers all aspects of the HPC domain: development of novel technologies, advancement of algorithms and applications, supporting applications  ...  EuroHPC's mandate was renewed in 2021 [2] through an updated and expanded program and a committed budget from the European Union that will exceed 7 billion Euro.  ...  The financial support by the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development and the Christian Doppler Research Association  ... 
doi:10.25365/phaidra.337 fatcat:uq5s5jm3ifhypd4clbtdhrz6ji

Evaluation and Usability of Programming Languages and Tools (PLATEAU) PLATEAU 2009

Craig Anslow, Shane Markstrum, Emerson Murphy-Hill
2010 unpublished
The same broad principles and specific techniques of sound interaction design apply.  ...  One of the pioneers of modern software engineering, he is an award-winning designer and author, recipient of the 2009 Stevens Award for his contributions to design and design methods, and a Fellow of the  ...  Acknowledgments I am extremely grateful to Luke Church and Alan Blackwell for their assistance, suggestions and feedback while designing and performing this experiment, as well as to my supervisor, Simon  ... 
fatcat:nb4ev2loh5cnzf4um3iisgzxem