Cilk: An Efficient Multithreaded Runtime System
1996
Journal of Parallel and Distributed Computing
Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming. ...
The Cilk runtime system currently runs on the Connection Machine CM5 MPP, the Intel Paragon MPP, the Silicon Graphics Power Challenge SMP, and the MIT Phish network of workstations. ...
Mike's PCM runtime system [18] ...
doi:10.1006/jpdc.1996.0107
fatcat:ccmvsopyqjgrthmjxehw5yr75y
Cilk
1995
SIGPLAN notices
Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming. ...
In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and analytically. ...
Cilk is a runtime system whose work-stealing scheduler is efficient in theory as well as in practice. ...
doi:10.1145/209937.209958
fatcat:2vcgudk5qraljetz2zbbp3cgr4
The Cilk++ concurrency platform
2009
Proceedings of the 46th Annual Design Automation Conference - DAC '09
The Cilk++ runtime system guarantees to load-balance computations effectively. ...
This paper overviews the Cilk++ programming environment, which incorporates a compiler, a runtime system, and a race-detection tool. ...
Acknowledgments Many thanks to the great team at Cilk Arts and to our many customers who have helped us refine the Cilk++ system. ...
doi:10.1145/1629911.1630048
dblp:conf/dac/Leiserson09
fatcat:5oenlyp7gvfidgh2snrrik7vdi
Reducers and other Cilk++ hyperobjects
2009
Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures - SPAA '09
language that enables multicore programming in the style of MIT Cilk. ...
This paper introduces hyperobjects, a linguistic mechanism that allows different branches of a multithreaded program to maintain coordinated local views of the same nonlocal variable. ...
ACKNOWLEDGMENTS A big thanks to the great engineering team at Cilk Arts. Thanks especially to John Carr, who contributed mightily to the implementation and optimization of reducers. ...
doi:10.1145/1583991.1584017
dblp:conf/spaa/FrigoHLL09
fatcat:jxogl7xnunht5oo3ikldllcb3i
A Comparative Study of Asynchronous Many-Tasking Runtimes: Cilk, Charm++, ParalleX and AM++
[article]
2019
arXiv
pre-print
We evaluate and compare four contemporary and emerging runtimes for high-performance computing (HPC) applications: Cilk, Charm++, ParalleX and AM++. ...
We also evaluate four mature implementations of these runtimes, namely: Intel Cilk++, Charm++ 6.5.1, AM++ and HPX, that embody the principles dictated by these models. ...
The Cilk multithreaded runtime system originally developed for the Connection Machine CM5 had support for distributed shared memory implemented in software. ...
arXiv:1904.00518v1
fatcat:euvfhakryzcbdmhpbrrxrfu6he
Efficient Detection of Determinacy Races in Cilk Programs
1999
Theory of Computing Systems
We have implemented a provably efficient determinacy-race detector for Cilk, an algorithmic multithreaded programming language. ...
The core of the Nondeterminator is an asymptotically efficient serial algorithm (inspired by Tarjan's nearly linear-time least-common-ancestors algorithm) for detecting determinacy races in series-parallel ...
of MIT provided many helpful suggestions and donated their Cilk application programs for testing. ...
doi:10.1007/s002240000120
fatcat:ftropx35j5cdlgxj6irodqxbcq
Efficiently Detecting Races in Cilk Programs That Use Reducer Hyperobjects
2015
Proceedings of the 27th ACM on Symposium on Parallelism in Algorithms and Architectures - SPAA '15
in how the Cilk runtime system manages a reducer. ...
A multithreaded Cilk program that is ostensibly deterministic may nevertheless behave nondeterministically due to programming errors in the code. ...
Because the Cilk runtime system creates and reduces views based on scheduling, such a read can cause multiple runs of the same Cilk program to produce different results. ...
doi:10.1145/2755573.2755599
dblp:conf/spaa/LeeS15
fatcat:aet5nakznrbx3nqqg7r4ehgc3e
Efficient detection of determinacy races in Cilk programs
1997
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures - SPAA '97
We have implemented a provably efficient determinacy-race detector for Cilk, an algorithmic multithreaded programming language. ...
The core of the Nondeterminator is an asymptotically efficient serial algorithm (inspired by Tarjan's nearly linear-time least-common-ancestors algorithm) for detecting determinacy races in series-parallel ...
of MIT provided many helpful suggestions and donated their Cilk application programs for testing. ...
doi:10.1145/258492.258493
dblp:conf/spaa/FengL97
fatcat:sju3dflf4rcupnxy2dzv3qnr7q
Language-based vectorization and parallelization using intrinsics, OpenMP, TBB and Cilk Plus
2018
Journal of Supercomputing
The aim of this paper is to evaluate OpenMP, TBB and Cilk Plus as basic language-based tools for simple and efficient parallelization of recursively defined computational problems and other problems that ...
Manual vectorization techniques based on Cilk array notation and intrinsics are presented. We also show how to simplify such optimization using Intel SIMD Data Layout Template containers. ...
In case of TBB and Cilk, the runtime system has been responsible for load balancing. The parallel loops in the Cilk version have been parallelized using _Cilk_for construct. ...
doi:10.1007/s11227-017-2231-3
fatcat:gr7vg37emzg75edfpqrsmbk7tu
Intel Cilk Plus for complex parallel algorithms: "Enormous Fast Fourier Transforms" (EFFT) library
2015
Parallel Computing
This work provides a new efficient DFFT implementation, and at the same time demonstrates an educational example of how computer science problems with complex parallel patterns can be optimized for high ...
performance using the Intel Cilk Plus framework. ...
For instance, if we had T = 24 workers and b = 32 bins, the runtime system would first distribute 24 bins across the 24 workers. ...
doi:10.1016/j.parco.2015.05.004
fatcat:chdit7gmkvfhthvqvzts22ygru
Multi-Core Program Optimization: Parallel Sorting Algorithms in Intel Cilk Plus
2014
International Journal of Hybrid Information Technology
Intel Cilk Plus is a C-based computing system that presents a straightforward and well-structured model for the development, verification and analysis of multicore and parallel programming. ...
New performance leaps have been achieved with multiprogramming and multi-core systems. ...
It informs the runtime system that the spawned function can run in parallel. The Cilk_sync keyword is used to wait for the spawned procedures to complete. ...
doi:10.14257/ijhit.2014.7.2.15
fatcat:agv7ks7okfcndktyarryn3thye
Parallel fast Fourier transform in SPMD style of Cilk
2019
International Journal of Embedded Systems
As a highly compact designed code, this code is compared with a highly tuned parallel recursive fast Fourier transform (FFT) using Cilk, which is included in Cilk package of version 5.4.6. ...
In this paper, we propose a parallel one-dimensional non-recursive fast Fourier transform (FFT) program based on conventional Cooley-Tukey's algorithm written in C using Cilk in single program multiple ...
Parallel fast Fourier transform in SPMD style of Cilk ...
doi:10.1504/ijes.2019.103998
fatcat:xwsgbgzdb5bctcuw5cq4w7twgq
Comparing Parallel Simulation of Social Agents Using Cilk and OpenCL
2011
2011 IEEE/ACM 15th International Symposium on Distributed Simulation and Real Time Applications
Simulation efficiency for two realistic models with varying complexity on a scale of 10^7 agents has shown the usefulness of both approaches. ...
To this end, we have performed simulation runs with parameter variation on a real parallel and distributed hardware platform using Cilk as well as on a GPU employing OpenCL. ...
Hence, the complexity of the system increases exponentially with an increase in the number of entities (humans, devices) in the system. ...
doi:10.1109/ds-rt.2011.12
dblp:conf/dsrt/MoserRZF11
fatcat:u2ppzoiz2vhxpihddfei3oveju
Factory: An Object-Oriented Parallel Programming Substrate for Deep Multiprocessors
[chapter]
2005
Lecture Notes in Computer Science
These processors are used as building blocks of shared-memory multiprocessor systems, or clusters of multiprocessors. ...
Moreover, Factory offers programmability and performance comparable to already established multithreading substrates. ...
Cilk [8] is an extension to C with explicit support for multithreaded programming. ...
doi:10.1007/11557654_28
fatcat:kd7m5pwvxfc35pt4bv2z3sfsqe
Memory-mapping support for reducer hyperobjects
2012
Proceedings of the 24th ACM symposium on Parallelism in algorithms and architectures - SPAA '12
We replaced the Intel Cilk Plus runtime system with our own Cilk-M runtime system which uses TLMM to implement a reducer mechanism that supports a reducer lookup using only two memory accesses and a predictable ...
An empirical evaluation shows that the Cilk-M memory-mapping approach is close to 4× faster than the Cilk Plus hypermap approach. ...
Thanks to Pablo Halpern of Intel, one of the original designers of reducers and a Cilk Plus developer, for helpful discussions on the implementation of reducers in Cilk Plus. ...
doi:10.1145/2312005.2312056
dblp:conf/spaa/LeeSL12
fatcat:ewpy3jsrczbqdb76gfgilh5jvu