A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit <a rel="external noopener" href="https://engineering.purdue.edu/dcsl/publications/papers/2014/checkpointing_jogc14.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
<i title="Springer Nature">
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/qrfhtnmiezbrbd2wgd2gragrxi" style="color: black;">Journal of Grid Computing</a>
In Fine-Grained Cycle Sharing (FGCS) systems, machine owners voluntarily share their unused CPU cycles with guest jobs, as long as their performance degradation is tolerable. However, unpredictable evictions of guest jobs lead to fluctuating completion times. Checkpoint-recovery is an attractive mechanism for recovering from such "failures". Today's FGCS systems often use expensive, high-performance dedicated checkpoint servers. However, in geographically distributed clusters, this may incur high checkpoint transfer latencies. In this paper we present a distributed checkpointing system called FALCON that uses available disk resources of the FGCS machines as shared checkpoint repositories. However, an unavailable storage host may lead to loss of checkpoint data. Therefore, we model the failures of a storage host and develop a prediction algorithm for choosing reliable checkpoint repositories. We experiment with FALCON in the university-wide Condor testbed at Purdue and show improved and consistent performance for guest jobs in the presence of irregular resource availability.
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s10723-014-9297-4">doi:10.1007/s10723-014-9297-4</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/dnbs2tb3pfh7lamgqmqdr645ju">fatcat:dnbs2tb3pfh7lamgqmqdr645ju</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170830055750/https://engineering.purdue.edu/dcsl/publications/papers/2014/checkpointing_jogc14.pdf" title="fulltext PDF download">Web Archive [PDF]</a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s10723-014-9297-4">springer.com</a>