WildFire: a scalable path for SMPs

E. Hagersten, M. Koster
1999 Proceedings Fifth International Symposium on High-Performance Computer Architecture  
We would like to thank the entire team for tireless and enthusiastic work during the WildFire hardware and software implementation.  ...  Acknowledgments The WildFire architecture was developed by Sun's High-End Server Engineering group, based in Massachusetts.  ...  We call such a scalable technology multiple SMP (MSMP). The rules for designing an MSMP differ somewhat from other scalable systems.  ... 
doi:10.1109/hpca.1999.744361 dblp:conf/hpca/HagerstenK99 fatcat:oqvwufwtozgv3f76v4dqsu5hxu

Case study: wildfire visualization

J. Ahrens, P. McCormick, J. Bossert, J. Reisner, J. Winterkamp
Proceedings. Visualization '97 (Cat. No. 97CB36155)  
In the initial phase of this project, scientists at Los Alamos are developing computer models to predict the spread of a wildfire.  ...  Visualization of the results of the wildfire simulation will be used by scientists to assess the quality of the simulation and eventually by fire personnel as a visual forecast of the wildfire's evolution  ...  Thanks to Genevieve Fox for creating the texture maps. We acknowledge the Advanced Computing Laboratory of Los Alamos National Laboratory, Los Alamos, NM 87545.  ... 
doi:10.1109/visual.1997.663919 dblp:conf/visualization/AhrensMBRW97 fatcat:qpgff5igk5dptflagfp26zeefe

Using multirail networks in high-performance clusters

Salvador Coll, Eitan Frachtenberg, Fabrizio Petrini, Adolfy Hoisie, Leonid Gurvits
2003 Concurrency and Computation  
In addition we propose a hybrid algorithm that combines the benefits of the local-dynamic for short messages with those of the dynamic algorithm for large messages.  ...  The Pittsburgh Supercomputing Center (PSC), 1 the second largest supercomputer in the world for unclassified research A shorter version of this paper and results for the static rail allocation can be found  ...  Acknowledgements We thank José Duato for spearheading the project, for pointing out the limitations of the static approach and suggesting the dynamic allocation strategy as a promising venue of research  ... 
doi:10.1002/cpe.725 fatcat:mptdvwlyxzfjxcqlb67g37dziq

THROOM – Supporting POSIX Multithreaded Binaries on a Cluster [chapter]

Henrik Löf, Zoran Radović, Erik Hagersten
2003 Lecture Notes in Computer Science  
The distributed threads execute in a global shared address space made coherent by a fine-grain SW-DSM layer.  ...  We also present THROOM, a proof-of-concept implementation that runs unmodified Pthread binaries on a virtual cluster modeled as standard UNIX processes.  ...  Implementation Details We have implemented the THROOM system on a 2-node Sun WildFire prototype SMP cluster [5, 6] .  ... 
doi:10.1007/978-3-540-45209-6_105 fatcat:nsmxiyz5abhariedzcaqhdmmli

Removing the overhead from software-based shared memory

Zoran Radović, Erik Hagersten
2001 Proceedings of the 2001 ACM/IEEE conference on Supercomputing (CDROM) - Supercomputing '01  
It demonstrates a protocol-handling overhead below a microsecond for all the actions involved in a remote load operation, to be compared to the fastest implementation to date of around ten microseconds  ...  This not only removes the asynchronous overhead, but also makes use of a processor that otherwise would stall.  ...  ) for help with couple of PARMACS applications, Sverker Holmgren and Henrik Löf (Department of Scientific Computing, Uppsala University) for providing access to the Sun Orange system.  ... 
doi:10.1145/582034.582090 dblp:conf/sc/RadovicH01 fatcat:tggugyxjt5f47kojsm7tuijc4q

Shared Memory Multiprocessors [chapter]

2004 Parallel Computing on Heterogeneous Networks  
A perfectly scalable operating system should increase its throughput linearly with the number of processors. There exist two conceptual approaches to developing a scalable operating system.  ...  The switching strategy can be either circuit switching, in which a path from the source to the destination is established and reserved before the actual data transfer, or packet switching, in which packets  ... 
doi:10.1002/0471654167.ch3 fatcat:dvaj7kmetfgr7bkmdrmvzljwda

Cache-only memory architectures

F. Dahlgren, J. Torrellas
1999 Computer  
Conse- quently, for an application to attain high performance,  the local memory module must satisfy a large fraction  of the cache misses.  ...  If the pro- gram's memory access patterns are too complicated for  the software to understand, individual data structures  may not end up being placed in the memory module of  the node that access  ...  Hagersten E, Koster M () WildFire: a scalable path for SMPs.  In: International symposium on high-performance computer  architecture, Orlando, January   .  ... 
doi:10.1109/2.769448 fatcat:ozadfmvoyne5jczmnm42x4ukza

Architecture and design of AlphaServer GS320

Kourosh Gharachorloo, Madhu Sharma, Simon Steely, Stephen Van Doren
2000 SIGPLAN notices  
The current implementation supports up to 8 such nodes for a total of 32 processors.  ...  These techniques allow us to generate commit events (which are used for ordering purposes) well in advance of formulating the reply to a transaction.  ...  Finally, we thank the anonymous reviewers for their comments.  ... 
doi:10.1145/356989.356991 fatcat:miinmtuqdfaevichpnztjrdqva

Selective, accurate, and timely self-invalidation using last-touch prediction

An-Chow Lai, Babak Falsafi
2000 Proceedings of the 27th annual international symposium on Computer architecture - ISCA '00  
The key behind accurate last-touch prediction is trace-based correlation, associating a last-touch with the sequence of instructions (i.e., a trace) touching the block from a coherence miss until the block  ...  Correlating instructions enables an LTP to identify a last-touch to a memory block uniquely throughout an application's execution.  ...  DSM servers offer a scalable performance path beyond symmetric multiprocessors (SMPs) [9, 4] by maintaining a compatible programming interface and allowing a large number of processors to share a single  ... 
doi:10.1145/339647.339669 fatcat:hpwufbdwl5cpza6rzo32pavksu

Selective, accurate, and timely self-invalidation using last-touch prediction

An-Chow Lai, Babak Falsafi
2000 SIGARCH Computer Architecture News  
The key behind accurate last-touch prediction is trace-based correlation, associating a last-touch with the sequence of instructions (i.e., a trace) touching the block from a coherence miss until the block  ...  Correlating instructions enables an LTP to identify a last-touch to a memory block uniquely throughout an application's execution.  ...  DSM servers offer a scalable performance path beyond symmetric multiprocessors (SMPs) [9, 4] by maintaining a compatible programming interface and allowing a large number of processors to share a single  ... 
doi:10.1145/342001.339669 fatcat:3tz52dfasba5bmofu5jjjxs7ji

Memory sharing predictor

An-Chow Lai, Babak Falsafi
1999 SIGARCH Computer Architecture News  
This paper also presents the first design and evaluation for a speculative coherent DSM using pattern-based predictors.  ...  We identify simple techniques and mechanisms to trigger prediction timely and perform speculation for remote read accesses.  ...  Acknowledgements We would like to thank Mark Hill, Alain Kägi, and the anonymous referees for their valuable feedback on earlier drafts of this paper.  ... 
doi:10.1145/307338.300994 fatcat:qxpyj5zu6rertosfduz3cfirwm

Runtime vs. Manual Data Distribution for Architecture-Agnostic Shared-Memory Programming Models

Dimitrios S. Nikolopoulos, Eduard Ayguadé, Constantine D. Polychronopoulos
2002 International journal of parallel programming  
This paper compares data distribution methodologies for scaling the performance of OpenMP on NUMA architectures.  ...  The results provide a proof of concept that it is possible to scale a portable shared-memory programming model up to more than 100 processors, without modifying the API and without exposing architectural  ...  The target architecture may be a small-scale SMP server, a ccNUMA supercomputer, a cluster running software distributed shared-memory (DSM) (2, 3) or even a multithreaded processor.  ... 
doi:10.1023/a:1019899812171 dblp:journals/ijpp/NikolopoulosAP02 fatcat:3neic6ykybhkzpytgjf4kqpija

A case for user-level dynamic page migration

Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé
2000 Proceedings of the 14th international conference on Supercomputing - ICS '00  
We also present a new technique for preventing page ping-pong and a mechanism for monitoring the performance of page migration algorithms at runtime and tuning their sensitive parameters accordingly.  ...  Our experimental evidence on a SGI Origin2000 shows that unmodified OpenMP codes linked with our runtime system for dynamic page migration are effectively immune to the page placement strategy of the operating  ...  (SMP) nodes are interconnected via a Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed  ... 
doi:10.1145/335231.335243 dblp:conf/ics/NikolopoulosPPLA00 fatcat:cc4ck37on5fdvdygx2f54mpn64

Open Source Framework for Enabling HPC and Cloud Geoprocessing Services

José Miguel Montañana, Paolo Marangino, Antonio Hervás
2020 Agris on-line Papers in Economics and Informatics  
Geoprocessing is a set of tools that can be used to efficiently address several pressing challenges for the global economy ranging from agricultural productivity, the design of transport networks, to  ...  secure access to large-scale HPC-enabled virtual industrial experimentation environment empowering scalable big data analytics), and EOPEN (opEn interOperable Platform for unified access and analysis  ...  We wish to thank Naweiluo Zhou for providing an up to date description of the CYBELE project.  ... 
doi:10.7160/aol.2020.120405 fatcat:rdglrsvkuzcklk573anb4oschq

Using multirail networks in high-performance clusters

S. Coll, E. Frachtenberg, F. Petrini, A. Hoisie, L. Gurvits
Proceedings 42nd IEEE Symposium on Foundations of Computer Science  
In addition we propose a hybrid algorithm that combines the benefits of the local-dynamic for short messages with those of the dynamic algorithm for large messages.  ...  The Pittsburgh Supercomputing Center (PSC), 1 the second largest supercomputer in the world for unclassified research A shorter version of this paper and results for the static rail allocation can be found  ...  Acknowledgements We thank José Duato for spearheading the project, for pointing out the limitations of the static approach and suggesting the dynamic allocation strategy as a promising venue of research  ... 
doi:10.1109/clustr.2001.959946 dblp:conf/cluster/CollFPHG01 fatcat:iqy6ge4ponajtdteu7nbv2hkma