643 Hits in 3.4 sec

Cilk: An Efficient Multithreaded Runtime System

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou
1996 Journal of Parallel and Distributed Computing  
Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming.  ...  The Cilk runtime system currently runs on the Connection Machine CM5 MPP, the Intel Paragon MPP, the Silicon Graphics Power Challenge SMP, and the MIT Phish network of workstations.  ...  Mike's PCM runtime system [18]  ... 
doi:10.1006/jpdc.1996.0107 fatcat:ccmvsopyqjgrthmjxehw5yr75y

Cilk

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou
1995 SIGPLAN notices  
Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming.  ...  In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and analytically.  ...  Cilk is a runtime system whose work-stealing scheduler is efficient in theory as well as in practice.  ... 
doi:10.1145/209937.209958 fatcat:2vcgudk5qraljetz2zbbp3cgr4

The Cilk++ concurrency platform

Charles E. Leiserson
2009 Proceedings of the 46th Annual Design Automation Conference - DAC '09  
The Cilk++ runtime system guarantees to load-balance computations effectively.  ...  This paper overviews the Cilk++ programming environment, which incorporates a compiler, a runtime system, and a race-detection tool.  ...  Acknowledgments Many thanks to the great team at Cilk Arts and to our many customers who have helped us refine the Cilk++ system.  ... 
doi:10.1145/1629911.1630048 dblp:conf/dac/Leiserson09 fatcat:5oenlyp7gvfidgh2snrrik7vdi

Reducers and other Cilk++ hyperobjects

Matteo Frigo, Pablo Halpern, Charles E. Leiserson, Stephen Lewin-Berlin
2009 Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures - SPAA '09  
language that enables multicore programming in the style of MIT Cilk.  ...  This paper introduces hyperobjects, a linguistic mechanism that allows different branches of a multithreaded program to maintain coordinated local views of the same nonlocal variable.  ...  ACKNOWLEDGMENTS A big thanks to the great engineering team at Cilk Arts. Thanks especially to John Carr, who contributed mightily to the implementation and optimization of reducers.  ... 
doi:10.1145/1583991.1584017 dblp:conf/spaa/FrigoHLL09 fatcat:jxogl7xnunht5oo3ikldllcb3i

A Comparative Study of Asynchronous Many-Tasking Runtimes: Cilk, Charm++, ParalleX and AM++ [article]

Abhishek Kulkarni, Andrew Lumsdaine
2019 arXiv   pre-print
We evaluate and compare four contemporary and emerging runtimes for high-performance computing (HPC) applications: Cilk, Charm++, ParalleX and AM++.  ...  We also evaluate four mature implementations of these runtimes, namely: Intel Cilk++, Charm++ 6.5.1, AM++ and HPX, that embody the principles dictated by these models.  ...  The Cilk multithreaded runtime system originally developed for the Connection Machine CM5 had support for distributed shared memory implemented in software.  ... 
arXiv:1904.00518v1 fatcat:euvfhakryzcbdmhpbrrxrfu6he

Efficient Detection of Determinacy Races in Cilk Programs

M. Feng
1999 Theory of Computing Systems  
We have implemented a provably efficient determinacy-race detector for Cilk, an algorithmic multithreaded programming language.  ...  The core of the Nondeterminator is an asymptotically efficient serial algorithm (inspired by Tarjan's nearly linear-time least-common-ancestors algorithm) for detecting determinacy races in series-parallel  ...  of MIT provided many helpful suggestions and donated their Cilk application programs for testing.  ... 
doi:10.1007/s002240000120 fatcat:ftropx35j5cdlgxj6irodqxbcq

Efficiently Detecting Races in Cilk Programs That Use Reducer Hyperobjects

I-Ting Angelina Lee, Tao B. Schardl
2015 Proceedings of the 27th ACM on Symposium on Parallelism in Algorithms and Architectures - SPAA '15  
in how the Cilk runtime system manages a reducer.  ...  A multithreaded Cilk program that is ostensibly deterministic may nevertheless behave nondeterministically due to programming errors in the code.  ...  Because the Cilk runtime system creates and reduces views based on scheduling, such a read can cause multiple runs of the same Cilk program to produce different results.  ... 
doi:10.1145/2755573.2755599 dblp:conf/spaa/LeeS15 fatcat:aet5nakznrbx3nqqg7r4ehgc3e

Efficient detection of determinacy races in Cilk programs

Mingdong Feng, Charles E. Leiserson
1997 Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures - SPAA '97  
We have implemented a provably efficient determinacy-race detector for Cilk, an algorithmic multithreaded programming language.  ...  The core of the Nondeterminator is an asymptotically efficient serial algorithm (inspired by Tarjan's nearly linear-time least-common-ancestors algorithm) for detecting determinacy races in series-parallel  ...  of MIT provided many helpful suggestions and donated their Cilk application programs for testing.  ... 
doi:10.1145/258492.258493 dblp:conf/spaa/FengL97 fatcat:sju3dflf4rcupnxy2dzv3qnr7q

Language-based vectorization and parallelization using intrinsics, OpenMP, TBB and Cilk Plus

Przemysław Stpiczyński
2018 Journal of Supercomputing  
The aim of this paper is to evaluate OpenMP, TBB and Cilk Plus as basic language-based tools for simple and efficient parallelization of recursively defined computational problems and other problems that  ...  Manual vectorization techniques based on Cilk array notation and intrinsics are presented. We also show how to simplify such optimization using Intel SIMD Data Layout Template containers.  ...  In case of TBB and Cilk, the runtime system has been responsible for load balancing. The parallel loops in the Cilk version have been parallelized using _Cilk_for construct.  ... 
doi:10.1007/s11227-017-2231-3 fatcat:gr7vg37emzg75edfpqrsmbk7tu

Intel Cilk Plus for complex parallel algorithms: "Enormous Fast Fourier Transforms" (EFFT) library

Ryo Asai, Andrey Vladimirov
2015 Parallel Computing  
This work provides a new efficient DFFT implementation, and at the same time demonstrates an educational example of how computer science problems with complex parallel patterns can be optimized for high  ...  performance using the Intel Cilk Plus framework.  ...  For instance, if we had T = 24 workers and b = 32 bins, the runtime system would first distribute 24 bins across the 24 workers.  ... 
doi:10.1016/j.parco.2015.05.004 fatcat:chdit7gmkvfhthvqvzts22ygru

Multi-Core Program Optimization: Parallel Sorting Algorithms in Intel Cilk Plus

Sabahat Saleem, M. IkramUllah Lali, M. Saqib Nawaz, Abou Bakar Nauman
2014 International Journal of Hybrid Information Technology  
Intel Cilk Plus is a C based computing system that presents a straight forward and well-structured model for the development, verification and analysis of multicore and parallel programming.  ...  New performance leaps has been achieved with multiprogramming and multi-core systems.  ...  It informs the runtime system that spawned function can run in parallel. Cilk_sync keyword is used to wait for the spawned procedures to complete.  ... 
doi:10.14257/ijhit.2014.7.2.15 fatcat:agv7ks7okfcndktyarryn3thye

Parallel fast Fourier transform in SPMD style of Cilk

Tien Hsiung Weng, Teng Xian Wang, Meng Yen Hsieh, Hai Jiang, Jun Shen, Kuan Ching Li
2019 International Journal of Embedded Systems  
As a highly compact designed code, this code is compared with a highly tuned parallel recursive fast Fourier transform (FFT) using Cilk, which is included in Cilk package of version 5.4.6.  ...  In this paper, we propose a parallel one-dimensional non-recursive fast Fourier transform (FFT) program based on conventional Cooley-Tukey's algorithm written in C using Cilk in single program multiple  ...  Parallel fast Fourier transform in SPMD style of Cilk  ... 
doi:10.1504/ijes.2019.103998 fatcat:xwsgbgzdb5bctcuw5cq4w7twgq

Comparing Parallel Simulation of Social Agents Using Cilk and OpenCL

Dominik Moser, Andreas Riener, Kashif Zia, Alois Ferscha
2011 2011 IEEE/ACM 15th International Symposium on Distributed Simulation and Real Time Applications  
Simulation efficiency for two realistic models with varying complexity on a scale of 10^7 agents has shown the usefulness of both approaches.  ...  To this end, we have performed simulation runs with parameter variation on a real parallel and distributed hardware platform using Cilk as well as on a GPU employing OpenCL.  ...  Hence, the complexity of the system increases exponentially with an increase in the number of entities (humans, devices) in the system.  ... 
doi:10.1109/ds-rt.2011.12 dblp:conf/dsrt/MoserRZF11 fatcat:u2ppzoiz2vhxpihddfei3oveju

Factory: An Object-Oriented Parallel Programming Substrate for Deep Multiprocessors [chapter]

Scott Schneider, Christos D. Antonopoulos, Dimitrios S. Nikolopoulos
2005 Lecture Notes in Computer Science  
These processors are used as building blocks of shared-memory multiprocessor systems, or clusters of multiprocessors.  ...  Moreover, Factory offers programmability and performance comparable to already established multithreading substrates.  ...  Cilk [8] is an extension to C with explicit support for multithreaded programming.  ... 
doi:10.1007/11557654_28 fatcat:kd7m5pwvxfc35pt4bv2z3sfsqe

Memory-mapping support for reducer hyperobjects

I-Ting Angelina Lee, Aamir Shafi, Charles E. Leiserson
2012 Proceedings of the 24th ACM symposium on Parallelism in algorithms and architectures - SPAA '12  
We replaced the Intel Cilk Plus runtime system with our own Cilk-M runtime system which uses TLMM to implement a reducer mechanism that supports a reducer lookup using only two memory accesses and a predictable  ...  An empirical evaluation shows that the Cilk-M memory-mapping approach is close to 4× faster than the Cilk Plus hypermap approach.  ...  Thanks to Pablo Halpern of Intel, one of the original designers of reducers and a Cilk Plus developer, for helpful discussions on the implementation of reducers in Cilk Plus.  ... 
doi:10.1145/2312005.2312056 dblp:conf/spaa/LeeSL12 fatcat:ewpy3jsrczbqdb76gfgilh5jvu