Efficient parallel algorithms on restartable fail-stop processors

Paris C. Kanellakis, Alex A. Shvartsman
1991 Proceedings of the tenth annual ACM symposium on Principles of distributed computing - PODC '91  
We study efficient deterministic executions of parallel algorithms on restartable fail-stop CRCW PRAMs. We allow the PRAM processors to be subject to arbitrary stop failures and restarts that are determined by an on-line adversary and that result in loss of private memory but do not affect shared memory. For this model, we define and justify two complexity measures: completed work, where processors are charged for completed fixed-size update cycles, and overhead ratio, which amortizes the work over necessary work and failures. This framework is a nontrivial extension of the fail-stop no-restart model of [KS 89]. We present a simulation strategy for any $N$-processor PRAM on a restartable fail-stop $P$-processor CRCW PRAM that guarantees a terminating execution of each simulated $N$-processor step, with $O(\log^2 N)$ overhead ratio and $O(\min\{N + P \log^2 N + M \log N,\ N \cdot P^{0.6}\})$ (sub-quadratic) completed work, where $M$ is the number of failures during this step's simulation. This strategy is work-optimal when the number of simulating processors is $P \le N / \log^2 N$ and the total number of failures per simulated $N$-processor step is $O(N / \log N)$. These results are based on a new algorithm for the Write-All problem, "$P$ processors write 1's in an array of size $N$", together with a modification of the main algorithm of [KS 89] and the techniques in [KPS 90, Shv 89]. We observe that, on $P = N$ restartable fail-stop processors, the Write-All problem requires $\Omega(N \log N)$ completed work, and that this lower bound holds even under the additional assumption that processors can read and locally process the entire shared memory at unit cost. Under this unrealistic assumption we have a matching upper bound. The lower bound also applies to the expected completed work of randomized algorithms that are subject to on-line adversaries. Finally, we describe a simple on-line adversary that causes inefficiency in many randomized algorithms.
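To make the Write-All problem and the completed-work measure concrete, here is a minimal Python sketch of a naive strategy under a random adversary. It is an illustration only and is not the algorithm of the paper. The function name `naive_write_all`, the parameters `fail_prob`, `seed`, and `max_steps`, the block assignment of cells to processors, and the rule that a failed processor restarts at the next step are all assumptions made for this example. Shared memory survives failures, a restarted processor loses its private pointer, and only completed update cycles are charged.

```python
import random

def naive_write_all(n, p, fail_prob=0.0, seed=0, max_steps=10_000):
    """Naive Write-All: p processors set all n shared cells to 1.

    Hypothetical model for illustration: in each synchronous step every
    processor either completes one update cycle (charged as one unit of
    completed work) or is stopped by a random adversary, in which case it
    loses its private pointer and restarts from its initial cell.
    """
    rng = random.Random(seed)
    shared = [0] * n                              # shared memory, unaffected by failures
    start = [pid * n // p for pid in range(p)]    # initial cell of each processor
    ptr = list(start)                             # private memory, lost on failure
    completed_work = 0
    failures = 0
    for _ in range(max_steps):
        # Termination is checked by the simulator here, not by the processors.
        if all(shared):
            break
        for pid in range(p):
            if rng.random() < fail_prob:
                failures += 1
                ptr[pid] = start[pid]             # restart with private memory reset
                continue                          # interrupted cycle is not charged
            shared[ptr[pid]] = 1                  # one completed update cycle
            ptr[pid] = (ptr[pid] + 1) % n
            completed_work += 1
    return shared, completed_work, failures
```

Without failures and with `p` dividing `n`, for example `naive_write_all(16, 4)`, the completed work equals `n`. With `fail_prob > 0`, restarted processors rewrite cells that are already set, so completed work grows beyond `n`; comparing it with `n + failures` gives a rough feel for the kind of amortization the overhead ratio captures.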
doi:10.1145/112600.112603 dblp:conf/podc/KanellakisS91 fatcat:cvyk3523cvhodgp7hh76uojpci