A course-based usability analysis of Cilk Plus and OpenMP
2015
2015 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)
Plus and OpenMP: split task into subtasks; assign subtasks to different threads
OpenMP vs. ...
RESULTS
                                                        Cilk Plus   OpenMP
Number of correct programs                              5           3
Average speedup                                         1.5         1.2
Number of correct programs with speedup at least 1.5    4           2
CORRECTNESS
4/8 PERFORMANCE
Speedup ...
doi:10.1109/vlhcc.2015.7357223
dblp:conf/vl/CoblenzSMSA15
fatcat:4qy6ocsmbvahpg2c5jkqwrcafu
D7.2.1 A Report on the Survey of HPC Tools and Techniques
2013
Zenodo
This deliverable contains a comprehensive survey of the research activity undertaken within PRACE to date so as to better understand what HPC tools and techniques have been developed that could be successfully ...
challenges of exascale computing and which have not yet b [...] ...
In this sense, Cilk Plus is considerably more limited than OpenMP. ...
doi:10.5281/zenodo.6575492
fatcat:grwigpxd7naifbzo6w67w4glrm
D7.5: HPC Programming Techniques
2012
Zenodo
the hybridization of important user codes to test the mixed OpenMP and MPI programming model. ...
Task 7.5 covered a plethora of different approaches and codes. The following deliverable is a summary of all projects performed within Task 7.5. ...
Some of the titles have been changed during the course of the project. ...
doi:10.5281/zenodo.6552939
fatcat:z2gdhmnojrh6bj2lordyl7lmnq
Deterministic parallel random-number generation for dynamic-multithreading platforms
2012
SIGPLAN Notices
We persuaded Intel to modify its commercial C/C++ compiler, which provides the Cilk Plus concurrency platform, to include pedigrees, and we built a library implementation of a deterministic parallel random-number ...
Specifically, on a suite of 10 benchmarks, the relative overhead of Cilk with pedigrees to the original Cilk has a geometric mean of less than 1%. ...
Dthreading concurrency platforms - including MIT Cilk [20], Cilk++ [34], Cilk Plus [28], Fortress [1], Habanero [2, 12], Hood [6], Java Fork/Join Framework [30], OpenMP 3.0 [40], Task Parallel ...
doi:10.1145/2370036.2145841
fatcat:rtml3ky2ojglppvude7l74l5mm
Deterministic parallel random-number generation for dynamic-multithreading platforms
2012
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming - PPoPP '12
We persuaded Intel to modify its commercial C/C++ compiler, which provides the Cilk Plus concurrency platform, to include pedigrees, and we built a library implementation of a deterministic parallel random-number ...
Specifically, on a suite of 10 benchmarks, the relative overhead of Cilk with pedigrees to the original Cilk has a geometric mean of less than 1%. ...
Dthreading concurrency platforms - including MIT Cilk [20], Cilk++ [34], Cilk Plus [28], Fortress [1], Habanero [2, 12], Hood [6], Java Fork/Join Framework [30], OpenMP 3.0 [40], Task Parallel ...
doi:10.1145/2145816.2145841
dblp:conf/ppopp/LeisersonSS12
fatcat:lwlrymvnc5cabkk4ljn3ec5ovq
Exploring Traditional and Emerging Parallel Programming Models Using a Proxy Application
2013
2013 IEEE 27th International Symposium on Parallel and Distributed Processing
In this paper, we compare several implementations of LULESH, a proxy application for shock hydrodynamics, to determine strengths and weaknesses of different programming models for parallel computation. ...
We focus on four traditional (OpenMP, MPI, MPI+OpenMP, CUDA) and four emerging (Chapel, Charm++, Liszt, Loci) programming models. ...
The views and opinions of authors expressed herein do not necessarily state or reflect those of the U.S. government or LLNS, and shall not be used for advertising or product endorsement purposes. ...
doi:10.1109/ipdps.2013.115
dblp:conf/ipps/KarlinBKCCDHLLWRSS13
fatcat:bgw5qt4punhwzgc3oqm3o7a6ya
D7.3: Inventory of Exascale Tools and Techniques
2016
Zenodo
analysis and exploitation phase. ...
A questionnaire was designed, which was circulated through a Point of Contact (PoC) for each CoE. The analysed findings are the subject of this document. ...
Acknowledgements The authors would like to acknowledge and thank the Centres of Excellence for their cooperation with and contribution to this deliverable. ...
doi:10.5281/zenodo.6801725
fatcat:ez63t2znsvdcpnvijzi4c74dc4
vSMC: Parallel Sequential Monte Carlo in C++
2014
Journal of Statistical Software
Sequential Monte Carlo is a family of algorithms for sampling from a sequence of distributions. ...
Two examples are presented: a simple particle filter and a classic Bayesian modeling problem. ...
We consider five different implementations supported by the Intel C++ Compiler 2013: sequential, Intel TBB, Cilk Plus, OpenMP and C++11 <thread>. ...
doi:10.18637/jss.v062.i09
fatcat:nrhvzziyxvck5lznupged6knnq
Real-time system support for hybrid structural simulation
2014
Proceedings of the 14th International Conference on Embedded Software - EMSOFT '14
We execute large numerical simulations within tight timing constraints and provide a reasonable assurance of timeliness and usability. ...
Instead, a hybrid testing framework connects part of a physical structure within a closed loop (through sensors and actuators) to a numerical simulation of the rest of the structure. ...
such as OpenMP [8] or Cilk Plus [7]. ...
doi:10.1145/2656045.2656067
dblp:conf/emsoft/FerryBMPDAGL14
fatcat:obyvr23lfbfyhi4gs4y3b7zmue
Runtime Adaptation for Autonomic Heterogeneous Computing
2014
2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
accelerator based systems as well as a synthesis of the two to address multiple levels of heterogeneity as a coherent whole. ...
More quietly it is increasing with the rise of NUMA systems, hierarchical caching, OS noise, and a myriad of other factors. ...
The information analysis component performs all analysis and summarization of the data. ...
doi:10.1109/ccgrid.2014.23
dblp:conf/ccgrid/ScoglandF14
fatcat:eat26fkykvebnfeqpuuwhhbfbu
Observationally Cooperative Multithreading
[article]
2015
arXiv
pre-print
Implementers and researchers also benefit from the agnostic nature of OCM -- it provides a level of abstraction to investigate, compare, and combine a variety of interesting concurrency-control techniques ...
and choose the one that offers the best performance. ...
, Claire Connelly, and Robert Keller for helpful comments on this paper. ...
arXiv:1502.05094v1
fatcat:poj7ejgatzd7lgrwhcmfnff5pu
vSMC: Parallel Sequential Monte Carlo in C++
[article]
2013
arXiv
pre-print
Sequential Monte Carlo is a family of algorithms for sampling from a sequence of distributions. ...
Two examples are presented: a simple particle filter and a classic Bayesian modeling problem. ...
We consider five different implementations supported by Intel C++ Compiler 2013: sequential, Intel TBB, Cilk Plus, OpenMP and C++11 <thread>. ...
arXiv:1306.5583v1
fatcat:dqwck4cocbggjlox6hg4ooqyqy
Addressing Application Bottlenecks: Microarchitecture
[chapter]
2014
Optimizing HPC Applications with Intel® Cluster Tools
In this chapter, we outline some of the general design principles of modern processors that will allow you to understand the do's and don'ts of diagnosing bottlenecks and to exploit tools to extract the ...
Furthermore, a certain understanding of assembly language is needed to reflect the findings back onto the original source code. ...
#include
Data Rearrangement Of course, those few intrinsics are not all there are; in few cases do we get data presented so readily usable, as with a matrix multiplication. ...
doi:10.1007/978-1-4302-6497-2_7
fatcat:gx3xowvqzveedbcxjltivscudi
Parallel Real-Time Scheduling for Latency-Critical Applications
2017
All of these would benefit me for my academic career and the rest of life. ...
During my graduate study, I have received enormous help and support from many people, and this thesis would not have been possible without all of them. ...
of Cholesky, LU, and Heat for OpenMP and Cilk Plus implementations (in seconds) and the ratio of the maximum execution times of Cilk Plus over OpenMP implementations.
1 target latency. ...
doi:10.7936/k7b27tpk
fatcat:hrhhktu54zdczjd73sfw6demdu
Pico: A Domain-Specific Language For Data Analytics Pipelines
2017
Zenodo
As result of this analysis, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level. ...
This analysis can be considered as a first step toward a formal model to be exploited in the design of a (new) framework for Big Data analytics. ...
Acknowledgements
Funding This work has been partially supported by the Italian Ministry of Education and Research (MIUR), by the EU-H2020 RIA project "Toreador" (no. 688797), the EU-H2020 RIA project ...
doi:10.5281/zenodo.579753
fatcat:aadje57qh5hk3ijmqn4j7vkhpm