1,020 Hits in 4.8 sec

Performance of parallel communication and spawning primitives on a Linux cluster

David J. Johnston, Martin Fleury, Michael Lincoln, Andrew C. Downton
2006 Cluster Computing  
The results expose a ranking in terms of process spawning and a similar ranking of communication software performance.  ...  by layering on top of an existing set of parallel primitives.  ...  Stelios Bounanos is thanked for his effective administration of the cluster and its software.  ... 
doi:10.1007/s10586-006-0007-2 fatcat:kluswahacrfmbc4glh6bjado5y

ScELA: Scalable and Extensible Launching Architecture for Clusters [chapter]

Jaidev K. Sridhar, Matthew J. Koop, Jonathan L. Perkins, Dhabaleswar K. Panda
2008 Lecture Notes in Computer Science  
We present the design of a scalable, extensible and high-performance job launch architecture for very large scale parallel computing.  ...  The job launch process includes two phases: spawning of processes on processors and information exchange between processes for job initialization.  ...  from Intel, Mellanox, Cisco, and Sun Microsystems; Equipment donations from Intel, Mellanox, AMD, Advanced Clustering, Appro, QLogic, and Sun Microsystems.  ... 
doi:10.1007/978-3-540-89894-8_30 fatcat:lkup7fhgm5frzclqlsxm633qpe

The Architecture and Performance of WMPI II [chapter]

Anders Lyhne Christensen, João Brito, João Gabriel Silva
2004 Lecture Notes in Computer Science  
WMPI II is the only commercial implementation of MPI 2.0 that runs on both Windows and Linux clusters.  ...  It supports both 32 and 64 bit versions and mixed clusters of Windows and Linux nodes. This paper describes the main design decisions and the multithreaded, non-polling architecture of WMPI II.  ...  The results of running the NAS parallel benchmarks (class B tests) are shown in Fig. 2c, and the Linpack scores are shown in Fig. 2d.  ...  The Linux benchmarks were run on a cluster of 16  ... 
doi:10.1007/978-3-540-30218-6_21 fatcat:qtgrbl4mdncafgclsgotv6c4cm

FOSS-Based Grid Computing [article]

A. Mani
2015 arXiv   pre-print
It is based on a technical report entitled 'Grid-Computing Using GNU/Linux' by the present author. This article was written in 2006 and should be of historical interest.  ...  In this expository paper we will be primarily concerned with core aspects of Grids and Grid computing using free and open-source software with some emphasis on utility computing.  ...  A spawn is the parallel analog of a C function call, and like a C function call, when a Cilk procedure is spawned, execution proceeds to the child.  ... 
arXiv:cs/0608122v4 fatcat:3r2ahqfo7na5tiiysgxsw77hva
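
The spawn/sync idea mentioned in this abstract can be illustrated with a short sketch. The code below is not from the article; it uses OpenMP tasks in C as a rough analog of Cilk's spawn (create a child computation that may run concurrently with its parent) and sync (wait before combining results). The compiler flag, function, and values are illustrative assumptions.

/* Minimal sketch, assuming an OpenMP-capable C compiler (e.g. gcc -fopenmp). */
#include <stdio.h>
#include <omp.h>

/* Recursive Fibonacci: each recursive call is "spawned" as an OpenMP task;
 * taskwait plays the role of sync before the partial results are combined. */
static long fib(int n)
{
    if (n < 2)
        return n;
    long a, b;
    #pragma omp task shared(a)
    a = fib(n - 1);          /* child computation may run concurrently with the parent */
    #pragma omp task shared(b)
    b = fib(n - 2);
    #pragma omp taskwait     /* the sync point: both children must finish here */
    return a + b;
}

int main(void)
{
    long result = 0;
    #pragma omp parallel
    #pragma omp single       /* one thread seeds the task tree */
    result = fib(20);
    printf("fib(20) = %ld\n", result);
    return 0;
}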

Performance of Multicore Systems on Parallel Data Clustering with Deterministic Annealing [chapter]

Xiaohong Qiu, Geoffrey C. Fox, Huapeng Yuan, Seung-Hee Bae, George Chrysanthakopoulos, Henrik Frystyk Nielsen
2008 Lecture Notes in Computer Science  
We present a performance analysis that compares MPI and a new messaging runtime library, CCR (Concurrency and Coordination Runtime), on Windows and Linux, using both threads and processes.  ...  We give results on message latency and bandwidth for two-processor multicore systems based on AMD and Intel architectures with a total of four and eight cores.  ...  It is interesting that the parallel kernels of most data-mining algorithms are similar to those well studied by the high performance (scientific) computing community and often need the synchronization primitives  ... 
doi:10.1007/978-3-540-69384-0_46 fatcat:hkdvthbf5zf6ha5uecppr7woue
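
Message latency and bandwidth figures such as those reported above are conventionally obtained with a ping-pong microbenchmark. The sketch below is not the paper's benchmark code; it is a generic MPI ping-pong in C, and the iteration count and message size are arbitrary assumptions.

/* Minimal sketch, assuming an MPI installation.
 * Build: mpicc pingpong.c -o pingpong    Run: mpirun -np 2 ./pingpong */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int iters = 1000;
    const int msg_size = 8;                    /* bytes; vary to sweep out a bandwidth curve */
    char *buf = malloc(msg_size);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, msg_size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, msg_size, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, msg_size, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, msg_size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0) {
        /* Half the round-trip time approximates one-way latency for small messages. */
        double one_way_us = (t1 - t0) / (2.0 * iters) * 1e6;
        printf("%d-byte messages: ~%.2f us one-way latency\n", msg_size, one_way_us);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}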

The Design of an API for Strict Multithreading in C++ [chapter]

Wolfgang Blochinger, Wolfgang Küchlin
2003 Lecture Notes in Computer Science  
The API is part of the Distributed Object-Oriented Threads System DOTS. The DOTS environment provides support for strict multithreaded computations on highly heterogeneous networks of workstations.  ...  This paper deals with the design of an API for building distributed parallel applications in C++ which embody strict multithreaded computations.  ...  and IBM Parallel Sysplex Cluster (clusters of IBM S/390 mainframes running under OS/390) [4].  ... 
doi:10.1007/978-3-540-45209-6_101 fatcat:4no4nako2vg2zeniquokhgy5ge

A case for scaling applications to many-core with OS clustering

Xiang Song, Haibo Chen, Rong Chen, Yuanxuan Wang, Binyu Zang
2011 Proceedings of the sixth conference on Computer systems - EuroSys '11  
Cerberus extends a traditional VMM with efficient support for resource sharing and communication among the clustered operating systems.  ...  up to 1.74X and 4.95X performance speedup compared to native Linux.  ...  Acknowledgments We thank our shepherd Andrew Baumann and the anonymous reviewers for their detailed and insightful comments.  ... 
doi:10.1145/1966445.1966452 dblp:conf/eurosys/SongCCWZ11 fatcat:lmxrk6q3qvdejed4gqktlwjgyu

Load Balancing for the Electronic Structure Program GREMLIN in a Very Heterogenous SSH-Connected WAN-Cluster of UNIX-Type Hosts [chapter]

Siegfried Höfinger
2001 Lecture Notes in Computer Science  
Taking into account these various speed data within a special dedicated load-balancing tool in an initial execution stage of GREMLIN may lead to a rather well balanced parallel performance and good scaling  ...  Operating systems, architectures and CPU performance of all 5 machines vary from LINUX-2.2.14/INTEL PPro-200MHz, over LINUX-2.2.13/INTEL PII-350MHz, OSF I V5.0/ALPHA EV6-500MHz, IRIX64 6.5/MIPS R10000  ...  Volkert from GUP Linz and Dr. Romaric David from ICPS Strasbourg for providing access to their supercomputer facilities.  ... 
doi:10.1007/3-540-45718-6_85 fatcat:okjxmurakracfnep4jj2uvo7ei

An evaluation of message passing implementations on Beowulf workstations

P.H. Carns, W.B. Ligon, S.P. McMillan, R.B. Ross
1999 1999 IEEE Aerospace Conference. Proceedings (Cat. No.99TH8403)  
One of the key building blocks of parallel applications on Beowulf workstations is a message passing library.  ...  This paper examines a set of four message passing libraries available for Beowulf workstations, focusing on their features, implementation, reliability, and performance.  ...  These "Pile-of-PCs" consist of a cluster of machines dedicated as nodes in a parallel processor, built entirely from commodity off-the-shelf parts, and employing a private system area network for communication  ... 
doi:10.1109/aero.1999.790188 fatcat:tcuqxy2fm5f6fbcielc26sgf2u

RoCL: A Resource Oriented Communication Library [chapter]

Albano Alves, António Pina, José Exposto, José Rufino
2003 Lecture Notes in Computer Science  
RoCL is a communication library that aims to exploit the low-level communication facilities of today's cluster networking hardware and to merge, via the resource oriented paradigm, those facilities and  ...  the high-level degree of parallelism achieved on SMP systems through multi-threading.  ...  Table 2. RoCL primitives to handle attribute lists. Currently RoCL is only supported under Linux. GM runs on Myrinet, MVIA runs on SysKonnect and Intel, and UDP runs on each of them.  ... 
doi:10.1007/978-3-540-45209-6_133 fatcat:hpxevls47nfnvg3fqp2np6ficm

Advanced Hybrid MPI/OpenMP Parallelization Paradigms for Nested Loop Algorithms onto Clusters of SMPs [chapter]

Nikolaos Drosinos, Nectarios Koziris
2003 Lecture Notes in Computer Science  
We implement the three variations and perform a number of micro-kernel benchmarks to verify the intuition that the hybrid programming model could potentially exploit the characteristics of an SMP cluster  ...  The parallelization process of nested-loop algorithms onto popular multi-level parallel architectures, such as clusters of SMPs, is not a trivial issue, since the existence of data dependencies in the  ...  The communication times follow a more irregular pattern, but on average they indicate a superior performance of the pure MPI model.  ... 
doi:10.1007/978-3-540-39924-7_30 fatcat:nsensostijb2bixnghpbxpyxau
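
The hybrid MPI/OpenMP model discussed in this entry typically maps MPI processes to SMP nodes and OpenMP threads to the cores within each node. The sketch below is not the paper's code; it shows the general pattern for a nested loop, with the array name, sizes, and computation purely illustrative.

/* Minimal sketch of a hybrid nested-loop sweep, assuming MPI + OpenMP.
 * Build: mpicc -fopenmp hybrid.c -o hybrid */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 1024

static double grid[N][N];   /* illustrative data only */

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Block-distribute the outer loop over MPI processes (one per SMP node). */
    int rows = N / size;
    int lo = rank * rows;
    int hi = (rank == size - 1) ? N : lo + rows;

    /* OpenMP threads on the node share the rows owned by this process. */
    #pragma omp parallel for
    for (int i = lo; i < hi; i++)
        for (int j = 0; j < N; j++)
            grid[i][j] = (double)(i + j);   /* placeholder computation */

    /* In a real nested-loop kernel with data dependencies, halo rows would be
     * exchanged here with MPI point-to-point calls before the next sweep. */
    MPI_Barrier(MPI_COMM_WORLD);
    if (rank == 0)
        printf("hybrid sweep finished on %d processes\n", size);

    MPI_Finalize();
    return 0;
}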

High Performance Multi-paradigm Messaging Runtime Integrating Grids and Multicore Systems

Xiaohong Qiu, Geoffrey C. Fox, Huapeng Yuan, Seung-Hee Bae, George Chrysanthakopoulos, Henrik Frystyk Nielsen
2007 Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)  
Users will want to compose heterogeneous components into single jobs and run seamlessly in both distributed fashion and on a future "Grid on a chip" with different subsets of cores supporting individual  ...  Our work uses managed code (C#) and for AMD and Intel processors shows around a factor of 5 better performance than Java.  ...  parallel problems have no significant inter-process communication and are often executed on a Grid.  ... 
doi:10.1109/e-science.2007.42 dblp:conf/eScience/QiuFYBCN07 fatcat:4brtlbdwfnchxk6vhf6ti7wkfa

An Introduction to Balder — An OpenMP Run-time Library for Clusters of SMPs [chapter]

Sven Karlsson
2008 Lecture Notes in Computer Science  
The run-time library presented can be used on SMPs and clusters of SMPs and it will provide a shared address space on a cluster.  ...  The performance of the library is evaluated and is shown to be competitive when compared to a commercial compiler from Intel.  ...  Nguyen-Thai Nguyen-Phan has implemented parts of the work-sharing, memory allocation and distributed lock primitives. He has been instrumental in testing the library. The  ... 
doi:10.1007/978-3-540-68555-5_7 fatcat:75ow5zbzencyblsts3ypj2hkpy

Efficient load balancing for wide-area divide-and-conquer applications

Rob V. van Nieuwpoort, Thilo Kielmann, Henri E. Bal
2001 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming - PPoPP '01  
Divide-and-conquer programs are easily parallelized by letting the programmer annotate potential parallelism in the form of spawn and sync constructs.  ...  Satin extends the Java language with two simple primitives for divide-and-conquer programming: spawn and sync.  ...  We thank John Romein, Grégory Mounié, and Martijn Bot for their helpful comments on a previous version of this manuscript.  ... 
doi:10.1145/379539.379563 dblp:conf/ppopp/NieuwpoortKB01 fatcat:sqo7sxl4sjbmbbqpd2huacnhfm

Efficient load balancing for wide-area divide-and-conquer applications

Rob V. van Nieuwpoort, Thilo Kielmann, Henri E. Bal
2001 SIGPLAN notices  
Divide-and-conquer programs are easily parallelized by letting the programmer annotate potential parallelism in the form of spawn and sync constructs.  ...  Satin extends the Java language with two simple primitives for divide-and-conquer programming: spawn and sync.  ...  We thank John Romein, Grégory Mounié, and Martijn Bot for their helpful comments on a previous version of this manuscript.  ... 
doi:10.1145/568014.379563 fatcat:hsnnqnwmbvctpnlyuxnepocz5e
Showing results 1 — 15 out of 1,020 results