3,712 Hits in 5.6 sec

Performance Evaluation of OpenMP Applications with Nested Parallelism [chapter]

Yoshizumi Tanaka, Kenjiro Taura, Mitsuhisa Sato, Akinori Yonezawa
2000 Lecture Notes in Computer Science  
there is a significant scalability benefit for applications with nested parallelism.  ...  Experimental results on Sun Ultra Enterprise 10000 with up to 60 processors show that overhead imposed by nested parallelism is very small (1-3% in five out of six applications, and 8% for the other), and  ...  The performance of our implementation is evaluated using several applications with both flat and nested parallelism.  ... 
doi:10.1007/3-540-40889-4_8 fatcat:zjiqxa365beqle3iyidvclopki

Performance Evaluation of a Multi-Zone Application in Different OpenMP Approaches

Haoqiang Jin, Barbara Chapman, Lei Huang, Dieter an Mey, Thomas Reichstein
2008 International journal of parallel programming  
We examine the performance impact of these extensions to OpenMP on a large shared memory machine and compare with hybrid and nested OpenMP programming models.  ...  We describe a performance study of a multi-zone application benchmark implemented in several OpenMP approaches that exploit multi-level parallelism and deal with unbalanced workload.  ...  Acknowledgements The authors would like to acknowledge fruitful discussions with Robert Hood, Johnny Chang, and support from the staff at NAS division for many experiments conducted on the Columbia supercomputer  ... 
doi:10.1007/s10766-008-0074-5 fatcat:rfrjj5zlbfbwvnmjkwi6mc7kcy

Supporting Nested OpenMP Parallelism in the TAU Performance System

Alan Morris, Allen D. Malony, Sameer S. Shende
2007 International journal of parallel programming  
Nested OpenMP parallelism allows an application to spawn teams of nested threads.  ...  Our approach has been implemented in the TAU performance system and has been successfully used in profiling and tracing OpenMP applications with nested parallelism.  ...  Acknowledgments Research at the University of Oregon is sponsored by contracts (DE-FG02-05ER25663, DE-FG02-05ER25680) from the MICS program of the U.S. Dept. of Energy, Office of Science.  ... 
doi:10.1007/s10766-007-0050-5 fatcat:by7cw3cgo5chbnwvaou24pkjse

Supporting Nested OpenMP Parallelism in the TAU Performance System [chapter]

Alan Morris, Allen D. Malony, Sameer S. Shende
2008 Lecture Notes in Computer Science  
Nested OpenMP parallelism allows an application to spawn teams of nested threads.  ...  Our approach has been implemented in the TAU performance system and has been successfully used in profiling and tracing OpenMP applications with nested parallelism.  ...  Acknowledgments Research at the University of Oregon is sponsored by contracts (DE-FG02-05ER25663, DE-FG02-05ER25680) from the MICS program of the U.S. Dept. of Energy, Office of Science.  ... 
doi:10.1007/978-3-540-68555-5_23 fatcat:wxbhs7yp75fzvn3dqsookotbiq

The Design of OpenMP Tasks

E. Ayguade, N. Copty, A. Duran, J. Hoeflinger, Yuan Lin, F. Massaioli, X. Teruel, P. Unnikrishnan, Guansong Zhang
2009 IEEE Transactions on Parallel and Distributed Systems  
We compare a prototype implementation of the tasking model with existing models, and evaluate it on a wide range of applications.  ...  With increasing application complexity, there is a growing need for addressing irregular parallelism in the presence of complicated control structures.  ...  ACKNOWLEDGMENTS The authors would like to acknowledge the rest of participants in the tasking subcommittee (Brian Bliss, Mark Bull, Eric Duncan, Roger Ferrer, Grant Haab, Diana King,  ... 
doi:10.1109/tpds.2008.105 fatcat:kbfeh3kxjzaddlknavazuowlzu

Guest Editors' Introduction

Rudolf Eigenmann, Eduard Ayguadé
2009 International journal of parallel programming  
of and control over nested parallel regions (including new API routines to determine nesting structure).  ...  IWOMP is the annual series of international workshops dedicated to the promotion and advancement of all aspects focusing on parallel programming with OpenMP.  ...  threads with similar behavior, making simpler the understanding of the application.  ... 
doi:10.1007/s10766-009-0095-8 fatcat:qoettu3vunhepbymx6rwfm2tfq

Guest Editors' Introduction

Matthias S. Müller, Eduard Ayguadé
2010 International journal of parallel programming  
IWOMP is the annual series of international workshops dedicated to the promotion and advancement of all aspects focusing on parallel programming with OpenMP.  ...  Version 3.0 was released in 2008, adding several new features to the OpenMP specification, including: tasking, loop collapse, enhanced loop schedules, and better definition of and control over nested parallel  ...  The simplicity of the parallel program and its scalability is compared to solutions like nested parallelism. The paper "Performance Evaluation of Mixed-mode OpenMP/MPI Implementations" by J.  ... 
doi:10.1007/s10766-010-0141-6 fatcat:nqapfsptszfwlhicsfcphbvuim

Task-Parallel Reductions in OpenMP and OmpSs [chapter]

Jan Ciesko, Sergi Mateo, Xavier Teruel, Vicenç Beltran, Xavier Martorell, Rosa M. Badia, Eduard Ayguadé, Jesús Labarta
2014 Lecture Notes in Computer Science  
Further we evaluate its implications on the OpenMP standard and present a prototype implementation in OmpSs.  ...  The wide adoption of parallel processing hardware in mainstream computing as well as the raising interest for efficient parallel programming in the developer community increase the demand for parallel  ...  Task-parallel reductions with OpenMP The idea to support task-parallel reductions builds on top of the conceptual framework introduced with explicit tasking in OpenMP.  ... 
doi:10.1007/978-3-319-11454-5_1 fatcat:strkf5tlirdxfojy5ozcqpdtxa

On the adequacy of lightweight thread approaches for high-level parallel programming models

Adrián Castelló, Rafael Mayo, Kevin Sala, Vicenç Beltran, Pavan Balaji, Antonio J. Peña
2018 Future generations computer systems  
The implementations of OpenMP usually rely on  ...  benefit application performance.  ...  High-level parallel programming models (PMs) are becoming crucial in order to extract the computational power of current on-node multi-threaded parallelism.  ...  GLTO Nested Parallelism Nested parallel codes are not common inside applications because its management is not as well designed as the parallel coarse-grained scenarios causing a performance drop  ... 
doi:10.1016/j.future.2018.02.016 fatcat:pbo2kyo4sjgzppbxjo2cf7ofza

GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations

Adrian Castello, Sangmin Seo, Rafael Mayo, Pavan Balaji, Enrique S. Quintana-Orti, Antonio J. Pena
2017 2017 46th International Conference on Parallel Processing (ICPP)  
OpenMP is the de facto standard application programming interface (API) for on-node parallelism.  ...  However, a recent trend in runtimes/applications points in the direction of leveraging massive on-node parallelism in conjunction with fine-grained and dynamic scheduling paradigms.  ...  OpenMP with Nested Parallelism Nested parallelism is not a common OpenMP pattern, but it may appear hidden to the user.  ... 
doi:10.1109/icpp.2017.15 dblp:conf/icpp/CastelloSMBQP17 fatcat:varshzcssbeq3mlytl5lqctiiq

Employing nested OpenMP for the parallelization of multi-zone computational fluid dynamics applications

Eduard Ayguade, Marc Gonzalez, Xavier Martorell, Gabriele Jost
2006 Journal of Parallel and Distributed Computing  
We report the benchmark results, compare the timings with those of different hybrid parallelization paradigms and discuss OpenMP implementation issues which affect the performance of multi-level parallel  ...  In this paper we describe the parallelization of the multi-zone code versions of the NAS Parallel Benchmarks employing multi-level OpenMP parallelism.  ...  Acknowledgments This work was supported by NASA contract DTTS59-99-D-O0437/A618 12D with Computer Sciences Corporation/AMTI and Corporation and by the Spanish Ministry of Science and Technology under contract  ... 
doi:10.1016/j.jpdc.2005.06.019 fatcat:jjand5nfavgh3cyivjawso7vde

Techniques for Enabling Highly Efficient Message Passing on Many-Core Architectures

Min Si, Pavan Balaji, Yutaka Ishikawa
2015 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing  
Many-core architecture provides a massively parallel environment with dozens of cores and hundreds of hardware threads.  ...  While application programmers have studied several approaches to achieve better parallelism and resource sharing, many of those approaches still face communication problems that degrade performance.  ...  Moreover, many implementations of the OpenMP runtime do not schedule work units from nested OpenMP parallel regions efficiently.  ... 
doi:10.1109/ccgrid.2015.68 dblp:conf/ccgrid/SiBI15 fatcat:edzfgmxrhfhvxbohfatxdei5ly

Exploiting Fine-Grain Thread Parallelism on Multicore Architectures

P.E. Hadjidoukas, G.Ch. Philos, V.V. Dimakopoulos
2009 Scientific Programming  
Its architecture encompasses a number of optimizations that make it particularly effective in managing a large number of threads and with low overheads.  ...  We evaluate our implementation on two multicore systems using synthetic microbenchmarks and a real-time face detection application.  ...  Evaluating nested parallelism based on application speedups [1, 25] gives overall performance indications but does not reveal potential construct-specific problems.  ... 
doi:10.1155/2009/249651 fatcat:tjc25fxytnfgve5o2j5ijq22uy

A high-performance face detection system using OpenMP

P. E. Hadjidoukas, V. V. Dimakopoulos, M. Delakis, C. Garcia
2009 Concurrency and Computation  
We present the development of a novel high-performance face detection system using a neural network-based classification algorithm and an efficient parallelization with OpenMP.  ...  Our parallelization strategy starts with one level of threads and moves to the exploitation of nested parallel regions in order to further improve, by up to 19%, the image processing capability.  ...  Our parallelization strategy also includes nested parallelization of the face detection system. Nested parallelism is a major feature of OpenMP that can improve application performance in many cases.  ... 
doi:10.1002/cpe.1389 fatcat:upbfotjux5cbbe4qbqnbyi4o4a

OpenMP tasks in IBM XL compilers

Xavier Teruel, Priya Unnikrishnan, Xavier Martorell, Eduard Ayguadé, Raul Silvera, Guansong Zhang, Ettore Tiotto
2008 Proceedings of the 2008 conference of the center for advanced studies on collaborative research meeting of minds - CASCON '08  
We also present a performance evaluation of our implementation on a set of benchmarks and applications.  ...  It was introduced to handle unstructured parallelism and broaden the range of applications that can be parallelized by OpenMP.  ...  no. 27648), the HiPEAC network of Excellence (IST-004408), and the Spanish Ministry of Education (contract no.  ... 
doi:10.1145/1463788.1463810 dblp:conf/cascon/TeruelUMASZT08 fatcat:6fc55xh7tvb5tiq2mmtq2hetsa