A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit <a rel="external noopener" href="http://christian-engelmann.info/publications/engelmann13toward.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
Toward a Performance/Resilience Tool for Hardware/Software Co-design of High-Performance Computing Systems
<span title="">2013</span>
<i title="IEEE">
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/3qrmigawsbhbljazwy2rerl5ni" style="color: black;">2013 42nd International Conference on Parallel Processing</a>
</i>
xSim is a simulation-based performance investigation toolkit that permits running high-performance computing (HPC) applications in a controlled environment with millions of concurrent execution threads, while observing application performance in a simulated extreme-scale system for hardware/software co-design. The presented work details newly developed features for xSim that permit the injection of MPI process failures, the propagation/detection/notification of such failures within the
<span class="external-identifiers">
<a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/icpp.2013.114">doi:10.1109/icpp.2013.114</a>
<a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/icpp/EngelmannN13.html">dblp:conf/icpp/EngelmannN13</a>
<a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/onbxzxytazaktejzc7e77gvwge">fatcat:onbxzxytazaktejzc7e77gvwge</a>
</span>
simulation, and their handling using application-level checkpoint/restart. These new capabilities enable the observation of application behavior and performance under failure within a simulated future-generation HPC system using the most common fault handling technique.
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170812071454/http://christian-engelmann.info/publications/engelmann13toward.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext">
<button class="ui simple right pointing dropdown compact black labeled icon button serp-button">
<i class="icon ia-icon"></i>
Web Archive
[PDF]
</button>
</a>
<a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/icpp.2013.114">
<button class="ui left aligned compact blue labeled icon button serp-button">
<i class="external alternate icon"></i>
ieee.com
</button>
</a>