A diskless checkpointing algorithm for super-scale architectures applied to the fast Fourier transform

C. Engelmann, A. Geist
Proceedings of the International Workshop on Challenges of Large Applications in Distributed Environments, 2003.  
First, we discuss the method of diskless checkpointing, then we adapt this technique to super-scale architectures and finally we present results from an implementation of the Fast Fourier Transform that  ...  In this paper, we adapt the present technique of diskless checkpointing to such huge distributed systems in order to equip existing scientific algorithms with super-scalable fault-tolerance.  ...  Finally, we use the Fast Fourier Transform algorithm to demonstrate the super-scalable diskless checkpointing.  ... 
doi:10.1109/clade.2003.1209999 dblp:conf/clade/EngelmannG03 fatcat:q6j4w6hq3bcubpq6geiwwwt5hu

D6.4: Report on approaches to Petascaling

Mohammad Jowkar, Carlo Cavazzoni, Xu Guo, Giorgos Goumas
2009 Zenodo  
Furthermore each application has been ported and optimized to several different architectures to get a better understanding of the suitability of the applications on different architectures and vice versa  ...  The work done in task 6.4 will together with task 6.5 be used in task 6.3 to create a benchmark set.  ...  The code authors try to keep a uniform format throughout the application. Generally useful comments are found in most parts of the code, but mostly in German.  ... 
doi:10.5281/zenodo.6546112 fatcat:rsmdzoeqbbbdzoe2zkx3czi2ry

SuperWeb: research issues in Java-based global computing

Albert D. Alexandrov, Maximilian Ibel, Klaus E. Schauser, Chris J. Scheiman
1997 Concurrency Practice and Experience  
We propose a new infrastructure, SuperWeb, to harness global resources, such as CPU cycles or disk storage, and make them available to every user on the Internet.  ...  The Internet, in particular the World-Wide-Web, continues to expand at an amazing pace.  ...  We would like to thank Peter Cappello, Bernd Christiansen, Mihai Ionescu, and Michael Neary for their work on Javelin and Bjorn Birnir for insightful discussions on encrypted computing.  ... 
doi:10.1002/(sici)1096-9128(199706)9:6<535::aid-cpe307>;2-1 fatcat:soy7rrweuvfqhlr2p7uwypi6lm

D5.2: Best Practices for HPC Procurement and Infrastructure

Norbert Meyer, Marcin Lawenda
2013 Zenodo  
Specific areas of interest are analysed in depth in terms of the market they belong to and the general HPC landscape, with a particular emphasis on the European point of view.  ...  -1IP WP8), which have all sought to reach informed decisions within PRACE as a whole on the acquisition and hosting of HPC systems and infrastructure.  ...  In addition it is also not scaling as fast as the performance of the computing is scaled.  ... 
doi:10.5281/zenodo.6572412 fatcat:2bqftmr5zzb7na6pnlrlxxuqnu

Load-Balance and Fault-Tolerance for Massively Parallel Phylogenetic Inference

Klaus Lukas Hübner
We benchmark our algorithms for checkpointing and recovery. In our experiments, creating a checkpoint of the model parameters requires at most 72.0 ± 0.9 ms (400 ranks, 4,116 partitions).  ...  We extend RAxML-ng, a widely used tool to build phylogenetic trees, to mitigate hardware failures without user intervention. For this, we increase the checkpointing frequency.  ...  Researchers [72], and Engelmann and Geist a Fast Fourier Transformation [29] that gracefully handle hardware faults. Kohl et al.  ... 
doi:10.5445/ir/1000124310 fatcat:wsq3luah2jblllz52rjcro7sxm