Performance of parallel communication and spawning primitives on a Linux cluster
2006
Cluster Computing
The results expose a ranking in terms of process spawning and a similar ranking of communication software performance. ...
by layering on top of an existing set of parallel primitives. ...
Stelios Bounanos is thanked for his effective administration of the cluster and its software. ...
doi:10.1007/s10586-006-0007-2
fatcat:kluswahacrfmbc4glh6bjado5y
ScELA: Scalable and Extensible Launching Architecture for Clusters
[chapter]
2008
Lecture Notes in Computer Science
We present the design of a scalable, extensible and high-performance job launch architecture for very large scale parallel computing. ...
The job launch process includes two phases: spawning of processes on processors and information exchange between processes for job initialization. ...
from Intel, Mellanox, Cisco, and Sun Microsystems; Equipment donations from Intel, Mellanox, AMD, Advanced Clustering, Appro, QLogic, and Sun Microsystems. ...
doi:10.1007/978-3-540-89894-8_30
fatcat:lkup7fhgm5frzclqlsxm633qpe
The Architecture and Performance of WMPI II
[chapter]
2004
Lecture Notes in Computer Science
WMPI II is the only commercial implementation of MPI 2.0 that runs on both Windows and Linux clusters. ...
It supports both 32 and 64 bit versions and mixed clusters of Windows and Linux nodes. This paper describes the main design decisions and the multithreaded, non-polling architecture of WMPI II. ...
The results of running the NAS parallel benchmarks (class B tests) are shown in Fig. 2c, and the Linpack scores are shown in Fig. 2d.
Linux Results: The Linux benchmarks were run on a cluster of 16 ...
doi:10.1007/978-3-540-30218-6_21
fatcat:qtgrbl4mdncafgclsgotv6c4cm
FOSS-Based Grid Computing
[article]
2015
arXiv
pre-print
It is based on a technical report entitled 'Grid-Computing Using GNU/Linux' by the present author. This article was written in 2006 and should be of historical interest. ...
In this expository paper we will be primarily concerned with core aspects of Grids and Grid computing using free and open-source software with some emphasis on utility computing. ...
A spawn is the parallel analog of a C function call, and like a C function call, when a Cilk procedure is spawned, execution proceeds to the child. ...
arXiv:cs/0608122v4
fatcat:3r2ahqfo7na5tiiysgxsw77hva
Performance of Multicore Systems on Parallel Data Clustering with Deterministic Annealing
[chapter]
2008
Lecture Notes in Computer Science
We present a performance analysis that compares MPI and a new messaging runtime library CCR (Concurrency and Coordination Runtime) with Windows and Linux and using both threads and processes. ...
We give results on message latency and bandwidth for two processor multicore systems based on AMD and Intel architectures with a total of four and eight cores. ...
It is interesting that the parallel kernels of most datamining algorithms are similar to those well studied by the high performance (scientific) computing community and often need the synchronization primitives ...
doi:10.1007/978-3-540-69384-0_46
fatcat:hkdvthbf5zf6ha5uecppr7woue
The Design of an API for Strict Multithreading in C++
[chapter]
2003
Lecture Notes in Computer Science
The API is part of the Distributed Object-Oriented Threads System DOTS. The DOTS environment provides support for strict multithreaded computations on highly heterogeneous networks of workstations. ...
This paper deals with the design of an API for building distributed parallel applications in C++ which embody strict multithreaded computations. ...
and IBM Parallel Sysplex Cluster (clusters of IBM S/390 mainframes running under OS/390) [4]. ...
doi:10.1007/978-3-540-45209-6_101
fatcat:4no4nako2vg2zeniquokhgy5ge
A case for scaling applications to many-core with OS clustering
2011
Proceedings of the sixth conference on Computer systems - EuroSys '11
Cerberus extends a traditional VMM with efficient support for resource sharing and communication among the clustered operating systems. ...
up to 1.74X and 4.95X performance speedup compared to native Linux. ...
Acknowledgments We thank our shepherd Andrew Baumann and the anonymous reviewers for their detailed and insightful comments. ...
doi:10.1145/1966445.1966452
dblp:conf/eurosys/SongCCWZ11
fatcat:lmxrk6q3qvdejed4gqktlwjgyu
Load Balancing for the Electronic Structure Program GREMLIN in a Very Heterogenous SSH-Connected WAN-Cluster of UNIX-Type Hosts
[chapter]
2001
Lecture Notes in Computer Science
Taking into account these various speed data within a special dedicated load balancing tool in an initial execution stage of GREMLIN, may lead to a rather well balanced parallel performance and good scaling ...
Operating-Systems, architectures and cpu-performances of all the 5 machines vary from LINUX-2.2.14/INTEL PPro-200MHz, over LINUX-2.2.13/INTEL PII-350MHz, OSF I V5.0/ALPHA EV6-500MHz, IRIX64 6.5/MIPS R10000 ...
Volkert from GUP Linz and Dr. Romaric David from ICPS Strasbourg for providing access to their supercomputer facilities. ...
doi:10.1007/3-540-45718-6_85
fatcat:okjxmurakracfnep4jj2uvo7ei
An evaluation of message passing implementations on Beowulf workstations
1999
1999 IEEE Aerospace Conference. Proceedings (Cat. No.99TH8403)
One of the key building blocks of parallel applications on Beowulf workstations is a message passing library. ...
This paper examines a set of four message passing libraries available for Beowulf workstations, focusing on their features, implementation, reliability, and performance. ...
These "Pile-of-PCs" consist of a cluster of machines dedicated as nodes in a parallel processor, built entirely from commodity off the shelf parts, and employing a private system area network for communication ...
doi:10.1109/aero.1999.790188
fatcat:tcuqxy2fm5f6fbcielc26sgf2u
RoCL: A Resource Oriented Communication Library
[chapter]
2003
Lecture Notes in Computer Science
RoCL is a communication library that aims to exploit the low-level communication facilities of today's cluster networking hardware and to merge, via the resource oriented paradigm, those facilities and ...
the high-level degree of parallelism achieved on SMP systems through multi-threading. ...
Table 2. RoCL primitives to handle attribute lists.
Currently RoCL is only supported under LINUX. GM runs on Myrinet, MVIA runs on SysKonnect and Intel and UDP runs on each of them. ...
doi:10.1007/978-3-540-45209-6_133
fatcat:hpxevls47nfnvg3fqp2np6ficm
Advanced Hybrid MPI/OpenMP Parallelization Paradigms for Nested Loop Algorithms onto Clusters of SMPs
[chapter]
2003
Lecture Notes in Computer Science
We implement the three variations and perform a number of micro-kernel benchmarks to verify the intuition that the hybrid programming model could potentially exploit the characteristics of an SMP cluster ...
The parallelization process of nested-loop algorithms onto popular multi-level parallel architectures, such as clusters of SMPs, is not a trivial issue, since the existence of data dependencies in the ...
The communication times follow a more irregular pattern, but on average they indicate a superior performance of the pure MPI model. ...
doi:10.1007/978-3-540-39924-7_30
fatcat:nsensostijb2bixnghpbxpyxau
High Performance Multi-paradigm Messaging Runtime Integrating Grids and Multicore Systems
2007
Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)
Users will want to compose heterogeneous components into single jobs and run seamlessly in both distributed fashion and on a future "Grid on a chip" with different subsets of cores supporting individual ...
Our work uses managed code (C#) and for AMD and Intel processors shows around a factor of 5 better performance than Java. ...
parallel problems have no significant inter-process communication and are often executed on a Grid. ...
doi:10.1109/e-science.2007.42
dblp:conf/eScience/QiuFYBCN07
fatcat:4brtlbdwfnchxk6vhf6ti7wkfa
An Introduction to Balder — An OpenMP Run-time Library for Clusters of SMPs
[chapter]
2008
Lecture Notes in Computer Science
The run-time library presented can be used on SMPs and clusters of SMPs and it will provide a shared address space on a cluster. ...
The performance of the library is evaluated and is shown to be competitive when compared to a commercial compiler from Intel. ...
Nguyen-Thai Nguyen-Phan has implemented parts of the work-sharing, memory allocation and distributed lock primitives. He has been instrumental in testing the library. The ...
doi:10.1007/978-3-540-68555-5_7
fatcat:75ow5zbzencyblsts3ypj2hkpy
Efficient load balancing for wide-area divide-and-conquer applications
2001
Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming - PPoPP '01
Divide-and-conquer programs are easily parallelized by letting the programmer annotate potential parallelism in the form of spawn and sync constructs. ...
Satin extends the Java language with two simple primitives for divide-and-conquer programming: spawn and sync. ...
We thank John Romein, Grégory Mounié, and Martijn Bot for their helpful comments on a previous version of this manuscript. ...
doi:10.1145/379539.379563
dblp:conf/ppopp/NieuwpoortKB01
fatcat:sqo7sxl4sjbmbbqpd2huacnhfm
Efficient load balancing for wide-area divide-and-conquer applications
2001
SIGPLAN notices
Divide-and-conquer programs are easily parallelized by letting the programmer annotate potential parallelism in the form of spawn and sync constructs. ...
Satin extends the Java language with two simple primitives for divide-and-conquer programming: spawn and sync. ...
We thank John Romein, Grégory Mounié, and Martijn Bot for their helpful comments on a previous version of this manuscript. ...
doi:10.1145/568014.379563
fatcat:hsnnqnwmbvctpnlyuxnepocz5e