A course-based usability analysis of Cilk Plus and OpenMP

Michael Coblenz, Robert Seacord, Brad Myers, Joshua Sunshine, Jonathan Aldrich
2015 2015 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)  
Plus and OpenMP: split task into subtasks; assign subtasks to different threads  ...  OpenMP vs.  ...  RESULTS (Cilk Plus vs. OpenMP): number of correct programs, 5 vs. 3; average speedup, 1.5 vs. 1.2; number of correct programs with speedup at least 1.5, 4 vs. 2  ...  CORRECTNESS 4/8 PERFORMANCE Speedup  ... 
doi:10.1109/vlhcc.2015.7357223 dblp:conf/vl/CoblenzSMSA15 fatcat:4qy6ocsmbvahpg2c5jkqwrcafu

D7.2.1 A Report on the Survey of HPC Tools and Techniques

Michael Lysaght, Bjorn Lindi, Vit Vondrak, John Donners, Marc Tajchman
2013 Zenodo  
This deliverable contains a comprehensive survey of the research activity undertaken within PRACE to date so as to better understand what HPC tools and techniques have been developed that could be successfully  ...  challenges of exascale computing and which have not yet b [...]  ...  In this sense, Cilk Plus is considerably more limited than OpenMP.  ... 
doi:10.5281/zenodo.6575492 fatcat:grwigpxd7naifbzo6w67w4glrm

D7.5: HPC Programming Techniques

Cevdet Aykanat, Antun Balaz, Iris Christadler, Ivan Girotto, Jose Gracia, Vladimir Slavnic, Andy Sunderland, Ata Türk
2012 Zenodo  
the hybridization of important user codes to test the mixed OpenMP and MPI programming model.  ...  Task 7.5 covered a plethora of different approaches and codes. The following deliverable is a summary of all projects performed within Task 7.5.  ...  Some of the titles have been changed during the course of the project.  ... 
doi:10.5281/zenodo.6552939 fatcat:z2gdhmnojrh6bj2lordyl7lmnq

Deterministic parallel random-number generation for dynamic-multithreading platforms

Charles E. Leiserson, Tao B. Schardl, Jim Sukha
2012 SIGPLAN notices  
We persuaded Intel to modify its commercial C/C++ compiler, which provides the Cilk Plus concurrency platform, to include pedigrees, and we built a library implementation of a deterministic parallel random-number  ...  Specifically, on a suite of 10 benchmarks, the relative overhead of Cilk with pedigrees to the original Cilk has a geometric mean of less than 1%.  ...  Dthreading concurrency platforms - including MIT Cilk [20], Cilk++ [34], Cilk Plus [28], Fortress [1], Habanero [2, 12], Hood [6], Java Fork/Join Framework [30], OpenMP 3.0 [40], Task Parallel  ... 
doi:10.1145/2370036.2145841 fatcat:rtml3ky2ojglppvude7l74l5mm

Deterministic parallel random-number generation for dynamic-multithreading platforms

Charles E. Leiserson, Tao B. Schardl, Jim Sukha
2012 Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming - PPoPP '12  
We persuaded Intel to modify its commercial C/C++ compiler, which provides the Cilk Plus concurrency platform, to include pedigrees, and we built a library implementation of a deterministic parallel random-number  ...  Specifically, on a suite of 10 benchmarks, the relative overhead of Cilk with pedigrees to the original Cilk has a geometric mean of less than 1%.  ...  Dthreading concurrency platforms - including MIT Cilk [20], Cilk++ [34], Cilk Plus [28], Fortress [1], Habanero [2, 12], Hood [6], Java Fork/Join Framework [30], OpenMP 3.0 [40], Task Parallel  ... 
doi:10.1145/2145816.2145841 dblp:conf/ppopp/LeisersonSS12 fatcat:lwlrymvnc5cabkk4ljn3ec5ovq

Exploring Traditional and Emerging Parallel Programming Models Using a Proxy Application

Ian Karlin, Abhinav Bhatele, Jeff Keasler, Bradford L. Chamberlain, Jonathan Cohen, Zachary Devito, Riyaz Haque, Dan Laney, Edward Luke, Felix Wang, David Richards, Martin Schulz (+1 others)
2013 2013 IEEE 27th International Symposium on Parallel and Distributed Processing  
In this paper, we compare several implementations of LULESH, a proxy application for shock hydrodynamics, to determine strengths and weaknesses of different programming models for parallel computation.  ...  We focus on four traditional (OpenMP, MPI, MPI+OpenMP, CUDA) and four emerging (Chapel, Charm++, Liszt, Loci) programming models.  ...  The views and opinions of authors expressed herein do not necessarily state or reflect those of the U.S. government or LLNS, and shall not be used for advertising or product endorsement purposes.  ... 
doi:10.1109/ipdps.2013.115 dblp:conf/ipps/KarlinBKCCDHLLWRSS13 fatcat:bgw5qt4punhwzgc3oqm3o7a6ya

D7.3: Inventory of Exascale Tools and Techniques

Nicola Mc Donnell
2016 Zenodo  
analysis and exploitation phase.  ...  A questionnaire was designed, which was circulated through a Point of Contact (PoC) for each CoE. The analysed findings are the subject of this document.  ...  Acknowledgements The authors would like to acknowledge and thank the Centres of Excellence for their cooperation with and contribution to this deliverable.  ... 
doi:10.5281/zenodo.6801725 fatcat:ez63t2znsvdcpnvijzi4c74dc4

vSMC: Parallel Sequential Monte Carlo in C++

Yan Zhou
2014 Journal of Statistical Software  
Sequential Monte Carlo is a family of algorithms for sampling from a sequence of distributions.  ...  Two examples are presented: a simple particle filter and a classic Bayesian modeling problem.  ...  We consider five different implementations supported by the Intel C++ Compiler 2013: sequential, Intel TBB, Cilk Plus, OpenMP and C++11 <thread>.  ... 
doi:10.18637/jss.v062.i09 fatcat:nrhvzziyxvck5lznupged6knnq

Real-time system support for hybrid structural simulation

David Ferry, Gregory Bunting, Amin Maghareh, Arun Prakash, Shirley Dyke, Kunal Agrawal, Chris Gill, Chenyang Lu
2014 Proceedings of the 14th International Conference on Embedded Software - EMSOFT '14  
We execute large numerical simulations within tight timing constraints and provide a reasonable assurance of timeliness and usability.  ...  Instead, a hybrid testing framework connects part of a physical structure within a closed loop (through sensors and actuators) to a numerical simulation of the rest of the structure.  ...  such as OpenMP [8] or Cilk Plus [7] .  ... 
doi:10.1145/2656045.2656067 dblp:conf/emsoft/FerryBMPDAGL14 fatcat:obyvr23lfbfyhi4gs4y3b7zmue

Runtime Adaptation for Autonomic Heterogeneous Computing

Thomas R.W. Scogland, Wu-Chun Feng
2014 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing  
accelerator based systems as well as a synthesis of the two to address multiple levels of heterogeneity as a coherent whole.  ...  More quietly it is increasing with the rise of NUMA systems, hierarchical caching, OS noise, and a myriad of other factors.  ...  The information analysis component performs all analysis and summarization of the data.  ... 
doi:10.1109/ccgrid.2014.23 dblp:conf/ccgrid/ScoglandF14 fatcat:eat26fkykvebnfeqpuuwhhbfbu

Observationally Cooperative Multithreading [article]

Christopher A. Stone, Melissa E. O'Neill, Sonja A. Bohr, Adam M. Cozzette, M. Joe DeBlasio, Julia Matsieva, Stuart A. Pernsteiner, Ari D. Schumer
2015 arXiv   pre-print
Implementers and researchers also benefit from the agnostic nature of OCM -- it provides a level of abstraction to investigate, compare, and combine a variety of interesting concurrency-control techniques  ...  and choose the one that offers the best performance.  ...  , Claire Connelly, and Robert Keller for helpful comments on this paper.  ... 
arXiv:1502.05094v1 fatcat:poj7ejgatzd7lgrwhcmfnff5pu

vSMC: Parallel Sequential Monte Carlo in C++ [article]

Yan Zhou
2013 arXiv   pre-print
Sequential Monte Carlo is a family of algorithms for sampling from a sequence of distributions.  ...  Two examples are presented: a simple particle filter and a classic Bayesian modeling problem.  ...  We consider five different implementations supported by Intel C++ Compiler 2013: sequential, Intel TBB, Cilk Plus, OpenMP and C++11 <thread>.  ... 
arXiv:1306.5583v1 fatcat:dqwck4cocbggjlox6hg4ooqyqy

Addressing Application Bottlenecks: Microarchitecture [chapter]

Alexander Supalov, Andrey Semin, Michael Klemm, Christopher Dahnken
2014 Optimizing HPC Applications with Intel® Cluster Tools  
In this chapter, we outline some of the general design principles of modern processors that will allow you to understand the do's and don'ts of diagnosing bottlenecks and to exploit tools to extract the  ...  Furthermore, a certain understanding of assembly language is needed to reflect the findings back onto the original source code.  ...  #include  ...  Data Rearrangement: Of course, those few intrinsics are not all there are; in few cases do we get data presented so readily usable, as with a matrix multiplication.  ... 
doi:10.1007/978-1-4302-6497-2_7 fatcat:gx3xowvqzveedbcxjltivscudi

Parallel Real-Time Scheduling for Latency-Critical Applications

Jing Li
2017
All of these would benefit me for my academic career and the rest of life.  ...  During my graduate study, I have received enormous help and support from many people, and this thesis would not have been possible without all of them.  ...  of Cholesky, LU, and Heat for OpenMP and Cilk Plus implementations (in seconds) and the ratio of the maximum execution times of Cilk Plus over OpenMP implementations. 1 target latency.  ... 
doi:10.7936/k7b27tpk fatcat:hrhhktu54zdczjd73sfw6demdu

Pico: A Domain-Specific Language For Data Analytics Pipelines

Claudia Misale, Marco Aldinucci, Guy Tremblay
2017 Zenodo  
As result of this analysis, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level.  ...  This analysis can be considered as a first step toward a formal model to be exploited in the design of a (new) framework for Big Data analytics.  ...  Acknowledgements Funding This work has been partially supported by the Italian Ministry of Education and Research (MIUR), by the EU-H2020 RIA project "Toreador" (no. 688797), the EU-H2020 RIA project  ... 
doi:10.5281/zenodo.579753 fatcat:aadje57qh5hk3ijmqn4j7vkhpm