Evaluating MPI collective communication on the SP2, T3D, and Paragon multicomputers

Kai Hwang, Choming Wang, Cho-Li Wang
Proceedings Third International Symposium on High-Performance Computer Architecture  
We evaluate the architectural support of collective communication operations on the IBM SP2, Cray T3D, and Intel Paragon.  ...  For total exchange with 64 nodes, the T3D, Paragon, and SP2 achieved an aggregated bandwidth of 1.745, 0.879, and 0.818 GBytes/s, respectively.  ...  We want also to thank the computing support groups at HKU and USC for their technical assistance in the benchmark experiments.  ... 
doi:10.1109/hpca.1997.569646 dblp:conf/hpca/HwangWW97 fatcat:kymxez2vhnh3pitg4nca45o43m

Resource scaling effects on MPP performance: the STAP benchmark implications

Kai Hwang, Choming Wang, Cho-Li Wang, Zhiwei Xu
1999 IEEE Transactions on Parallel and Distributed Systems  
For MPP users, the scaling results can be applied to partition a large workload for SPMD execution or to minimize the software overhead in collective communication or remote memory update operations.  ...  The Intel Paragon trails far behind due to slow processors used and excessive latency experienced in passing messages.  ...  The research was supported by Hong Kong Research Grants Council grants HKU 2/96C, HKU 7022/97E, HKU 548/96E, and HKU 7030/98E and by the development fund of the Area-of-Excellence in Information Technology  ... 
doi:10.1109/71.770197 fatcat:a4kn4bf5zzbl7bvljivnako3za

Scalable parallel computers for real-time signal processing

Kai Hwang, Zhiwei Xu
1996 IEEE Signal Processing Magazine  
In particular, we evaluate the IBM SP2 at MHPCC [33], the Intel Paragon at SDSC [38], the Cray T3D at Cray Eagan Center [1], and the Cray T3E and ASCI TeraFLOP system recently proposed by Intel [32].  ...  Our experiences in porting the MIT Lincoln Laboratory STAP (space-time adaptive processing) benchmark programs onto the SP2, T3D, and Paragon are reported.  ...  The Project was supported by a research subcontract from MIT Lincoln Laboratory to USC. The revision of the paper was done at the University of Hong Kong, subsequently. We  ... 
doi:10.1109/79.526898 fatcat:lqng5sb2rvei5jedkzgkl5jz6y

Wide-area implementation of the Message Passing Interface

Ian Foster, Jonathan Geisler, William Gropp, Nicholas Karonis, Ewing Lusk, George Thiruvathukal, Steven Tuecke
1998 Parallel Computing  
We describe how these various mechanisms are supported in the Nexus implementation of MPI and present performance results for this implementation on multicomputers and networked systems.  ...  This implementation has been constructed by extending the Argonne MPICH implementation of MPI to use communication services provided by the Nexus communication library and authentication, resource allocation  ...  Acknowledgments Our work on Nexus and Globus is a joint effort with Carl Kesselman and his colleagues at the USC Information Sciences Institute.  ... 
doi:10.1016/s0167-8191(98)00075-1 fatcat:3vnkmmkoyrghnhnxst37fcztqm

Cluster Computing [chapter]

Mark Baker, John Brooke, Ken Hawick, Rajkumar Buyya
2001 Lecture Notes in Computer Science  
in software development, where the usage of the workstation revolves around the edit, compile, debug and test cycle.  ...  Typically, there are three types of "owner": • Ones who use their workstations for sending and receiving mail or preparing papers, such as administrative staff, librarian, theoreticians, etc. • Ones involved  ...  Future work: • Further development of the adaptive collective communications − based on the MPI collective communications operations • Develop new distributed object classes and applications. • Implement  ... 
doi:10.1007/3-540-44681-8_100 fatcat:cr6rpynstjgufacwbmciciriha

The Nexus Approach to Integrating Multithreading and Communication

Ian Foster, Carl Kesselman, Steven Tuecke
1996 Journal of Parallel and Distributed Computing  
In this paper, we address the question of how to integrate threads and communication in high-performance distributed-memory systems.  ...  We report the results of performance studies conducted using a Nexus implementation; these results indicate that Nexus mechanisms can be implemented efficiently on commodity hardware and software systems  ...  Acknowledgments We are grateful to Hubertus Franke, John Garnett, Jonathan Geisler, David Kohr, Tal Lancaster, Robert Olson, and James Patton for their input to the Nexus design and implementation.  ... 
doi:10.1006/jpdc.1996.0108 fatcat:kp25veq36fcm5mpv6rzq655xbe

MPI on the I-WAY: a wide-area, multimethod implementation of the Message Passing Interface

I. Foster, J. Geisler, S. Tuecke
Proceedings. Second MPI Developer's Conference  
However, the wide-area environment introduces challenging problems for the MPI implementor, because of the heterogeneity of both the underlying physical infrastructure and the authentication and software  ...  Nexus provides automatic configuration mechanisms that can be used to select and configure authentication, process creation, and communication mechanisms in heterogeneous systems.  ...  (IBM SP, Intel Paragon, Cray T3D, etc.), shared-memory multiprocessors (SGI Challenge, Convex Exemplar), and vector multiprocessors (Cray C90, Y-MP).  ... 
doi:10.1109/mpidc.1996.534089 fatcat:kdqsx23jyzbtxbtiabqcebyt3e

Toward optimal complete exchange on wormhole-routed tori

Yu-Chee Tseng, Sze-Yao Ni, Jang-Ping Sheu
1999 IEEE Transactions on Computers  
Numerical analysis and experiment both show that significant improvement can be obtained by our scheme on total communication latency over existing results.  ...  Abstract: In this paper, we propose new routing schemes to perform all-to-all personalized communication (or known as complete exchange) in wormhole-routed, one-port tori.  ...  A preliminary version of this paper appeared in the Proceedings of the 1997 International Conference on Parallel and Distributed Systems [33].  ... 
doi:10.1109/12.805156 fatcat:7yxyevmwnrgyre5qn4lp4lhdze

Theory and practice in parallel job scheduling [chapter]

Dror G. Feitelson, Larry Rudolph, Uwe Schwiegelshohn, Kenneth C. Sevcik, Parkson Wong
1997 Lecture Notes in Computer Science  
The scheduling of jobs on parallel supercomputer is becoming the subject of much research. However, there is concern about the divergence of theory and practice.  ...  We review theoretical research in this area, and recommendations based on recent results.  ...  The goal of the PSCHED API is to allow a site to write a scheduler that could schedule a variety of parallel jobs: MPI-2, PVM, and SMP multi-tasking jobs to run on a collection of different machines.  ... 
doi:10.1007/3-540-63574-2_14 fatcat:amfqcbwutbh45l3hnjrjteejjq

Advanced theory and practice for high performance computing and communications

Geoffrey Fox
2011 Concurrency and Computation  
We review possible and probable industrial applications of HPCC focusing on the software and hardware issues.  ...  The software models span HPF and data parallelism, to distributed information systems and object/dataflow parallelism on the Web.  ...  such as IBM SP2, Intel's Paragon, Cray's T3D, as well as networks of workstations.  ... 
doi:10.1002/cpe.1863 fatcat:dxzlqgakunhbpdkabvaktvwbay

19th CERN School of Computing [article]

Carlo E Vandoni
Abstract After a brief overview of image science and image processing, we concentrate on the topic of image enhancement, restoration, and reconstruction, and offer three insights: (i) Severely degraded  ...  Abstract A review is given of the use of neural networks for nonlinear mapping of high dimensional data on lower dimensional structures. Both, unsupervised and supervised techniques are considered.  ...  and HPF+, which led to the results reported on in this paper; the financial support of the European Commission for those projects is also gratefully acknowledged.  ... 
doi:10.5170/cern-1996-008 fatcat:un3kblsloja45p2vzquj4weaby