Performance Evaluation of OpenMP Applications with Nested Parallelism
[chapter]
2000
Lecture Notes in Computer Science
there is a significant scalability benefit for applications with nested parallelism. ...
Experimental results on Sun Ultra Enterprise 10000 with up to 60 processors show that overhead imposed by nested parallelism is very small (1-3% in five out of six applications, and 8% for the other), and ...
The performance of our implementation is evaluated using several applications with both flat and nested parallelism. ...
doi:10.1007/3-540-40889-4_8
fatcat:zjiqxa365beqle3iyidvclopki
Performance Evaluation of a Multi-Zone Application in Different OpenMP Approaches
2008
International journal of parallel programming
We examine the performance impact of these extensions to OpenMP on a large shared memory machine and compare with hybrid and nested OpenMP programming models. ...
We describe a performance study of a multi-zone application benchmark implemented in several OpenMP approaches that exploit multi-level parallelism and deal with unbalanced workload. ...
Acknowledgements The authors would like to acknowledge fruitful discussions with Robert Hood, Johnny Chang, and support from the staff at NAS division for many experiments conducted on the Columbia supercomputer ...
doi:10.1007/s10766-008-0074-5
fatcat:rfrjj5zlbfbwvnmjkwi6mc7kcy
Supporting Nested OpenMP Parallelism in the TAU Performance System
2007
International journal of parallel programming
Nested OpenMP parallelism allows an application to spawn teams of nested threads. ...
Our approach has been implemented in the TAU performance system and has been successfully used in profiling and tracing OpenMP applications with nested parallelism. ...
Acknowledgments Research at the University of Oregon is sponsored by contracts (DE-FG02-05ER25663, DE-FG02-05ER25680) from the MICS program of the U.S. Dept. of Energy, Office of Science. ...
doi:10.1007/s10766-007-0050-5
fatcat:by7cw3cgo5chbnwvaou24pkjse
Supporting Nested OpenMP Parallelism in the TAU Performance System
[chapter]
2008
Lecture Notes in Computer Science
Nested OpenMP parallelism allows an application to spawn teams of nested threads. ...
Our approach has been implemented in the TAU performance system and has been successfully used in profiling and tracing OpenMP applications with nested parallelism. ...
Acknowledgments Research at the University of Oregon is sponsored by contracts (DE-FG02-05ER25663, DE-FG02-05ER25680) from the MICS program of the U.S. Dept. of Energy, Office of Science. ...
doi:10.1007/978-3-540-68555-5_23
fatcat:wxbhs7yp75fzvn3dqsookotbiq
The Design of OpenMP Tasks
2009
IEEE Transactions on Parallel and Distributed Systems
We compare a prototype implementation of the tasking model with existing models, and evaluate it on a wide range of applications. ...
With increasing application complexity, there is a growing need for addressing irregular parallelism in the presence of complicated control structures. ...
ACKNOWLEDGMENTS The authors would like to acknowledge the rest of participants in the tasking subcommittee (Brian Bliss, Mark Bull, Eric Duncan, Roger Ferrer, Grant Haab, Diana King, ...
doi:10.1109/tpds.2008.105
fatcat:kbfeh3kxjzaddlknavazuowlzu
Guest Editors' Introduction
2009
International journal of parallel programming
of and control over nested parallel regions (including new API routines to determine nesting structure). ...
IWOMP is the annual series of international workshops dedicated to the promotion and advancement of all aspects focusing on parallel programming with OpenMP. ...
threads with similar behavior, making simpler the understanding of the application. ...
doi:10.1007/s10766-009-0095-8
fatcat:qoettu3vunhepbymx6rwfm2tfq
Guest Editors' Introduction
2010
International journal of parallel programming
IWOMP is the annual series of international workshops dedicated to the promotion and advancement of all aspects focusing on parallel programming with OpenMP. ...
Version 3.0 was released in 2008, adding several new features to the OpenMP specification, including: tasking, loop collapse, enhanced loop schedules, and better definition of and control over nested parallel ...
The simplicity of the parallel program and its scalability is compared to solutions like nested parallelism. The paper "Performance Evaluation of Mixed-mode OpenMP/MPI Implementations" by J. ...
doi:10.1007/s10766-010-0141-6
fatcat:nqapfsptszfwlhicsfcphbvuim
Task-Parallel Reductions in OpenMP and OmpSs
[chapter]
2014
Lecture Notes in Computer Science
Further we evaluate its implications on the OpenMP standard and present a prototype implementation in OmpSs. ...
The wide adoption of parallel processing hardware in mainstream computing as well as the raising interest for efficient parallel programming in the developer community increase the demand for parallel ...
Task-parallel reductions with OpenMP The idea to support task-parallel reductions builds on top of the conceptual framework introduced with explicit tasking in OpenMP. ...
doi:10.1007/978-3-319-11454-5_1
fatcat:strkf5tlirdxfojy5ozcqpdtxa
On the adequacy of lightweight thread approaches for high-level parallel programming models
2018
Future generations computer systems
The implementations of OpenMP usually rely on ... benefit application performance. ... version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ ...
High-level parallel programming models (PMs) are becoming crucial in order to extract the computational power of current on-node multi-threaded parallelism. ...
GLTO Nested Parallelism
Nested parallel codes are not common inside applications because its management is not as well designed as the parallel coarse-grained scenarios causing a performance drop ...
doi:10.1016/j.future.2018.02.016
fatcat:pbo2kyo4sjgzppbxjo2cf7ofza
GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations
2017
2017 46th International Conference on Parallel Processing (ICPP)
OpenMP is the de facto standard application programming interface (API) for on-node parallelism. ...
However, a recent trend in runtimes/applications points in the direction of leveraging massive on-node parallelism in conjunction with fine-grained and dynamic scheduling paradigms. ...
OpenMP with Nested Parallelism Nested parallelism is not a common OpenMP pattern, but it may appear hidden to the user. ...
doi:10.1109/icpp.2017.15
dblp:conf/icpp/CastelloSMBQP17
fatcat:varshzcssbeq3mlytl5lqctiiq
Employing nested OpenMP for the parallelization of multi-zone computational fluid dynamics applications
2006
Journal of Parallel and Distributed Computing
We report the benchmark results, compare the timings with those of different hybrid parallelization paradigms and discuss OpenMP implementation issues which effect the performance of multi-level parallel ...
In this paper we describe the parallelization of the multi-zone code versions of the NAS Parallel Benchmarks employing multi-level OpenMP parallelism. ...
Acknowledgments This work was supported by NASA contract DTTS59-99-D-O0437/A618 12D with Computer Sciences Corporation/AMTI and Corporation and by the Spanish Ministry of Science and Technology under contract ...
doi:10.1016/j.jpdc.2005.06.019
fatcat:jjand5nfavgh3cyivjawso7vde
Techniques for Enabling Highly Efficient Message Passing on Many-Core Architectures
2015
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
Many-core architecture provides a massively parallel environment with dozens of cores and hundreds of hardware threads. ...
While application programmers have studied several approaches to achieve better parallelism and resource sharing, many of those approaches still face communication problems that degrade performance. ...
Moreover, many implementations of the OpenMP runtime do not schedule work units from nested OpenMP parallel regions efficiently. ...
doi:10.1109/ccgrid.2015.68
dblp:conf/ccgrid/SiBI15
fatcat:edzfgmxrhfhvxbohfatxdei5ly
Exploiting Fine-Grain Thread Parallelism on Multicore Architectures
2009
Scientific Programming
Its architecture encompasses a number of optimizations that make it particularly effective in managing a large number of threads and with low overheads. ...
We evaluate our implementation on two multicore systems using synthetic microbenchmarks and a real-time face detection application. ...
Evaluating nested parallelism based on application speedups [1, 25] gives overall performance indications but does not reveal potential construct-specific problems. ...
doi:10.1155/2009/249651
fatcat:tjc25fxytnfgve5o2j5ijq22uy
A high-performance face detection system using OpenMP
2009
Concurrency and Computation
We present the development of a novel high-performance face detection system using a neural network-based classification algorithm and an efficient parallelization with OpenMP. ...
Our parallelization strategy starts with one level of threads and moves to the exploitation of nested parallel regions in order to further improve, by up to 19%, the image processing capability. ...
Our parallelization strategy also includes nested parallelization of the face detection system. Nested parallelism is a major feature of OpenMP that can improve application performance in many cases. ...
doi:10.1002/cpe.1389
fatcat:upbfotjux5cbbe4qbqnbyi4o4a
OpenMP tasks in IBM XL compilers
2008
Proceedings of the 2008 conference of the center for advanced studies on collaborative research meeting of minds - CASCON '08
We also present a performance evaluation of our implementation on a set of benchmarks and applications. ...
It was introduced to handle unstructured parallelism and broaden the range of applications that can be parallelized by OpenMP. ...
no. 27648), the HiPEAC network of Excellence (IST-004408), and the Spanish Ministry of Education (contract no. ...
doi:10.1145/1463788.1463810
dblp:conf/cascon/TeruelUMASZT08
fatcat:6fc55xh7tvb5tiq2mmtq2hetsa