Algorithms and Complexity [chapter]

Peter Brucker, Sigrid Knust
2011 Complex Scheduling  
Distributed platforms and communication models. Distributed-memory parallel computing platforms pose many challenges to the algorithm designer and the programmer. An obvious factor contributing to this complexity is the need for network communication, whose performance is difficult to model in a way that is both precise and conducive to understanding the performance of algorithms. In light of the complexity of performance modeling for network communications, the vast
... of scheduling works and results address a very simple model which assumes that there is no contention for network links. In other words, a processor can send distinct messages to a thousand processors at the same speed as if there were a single message! Recent papers [63, 67, 111] suggest taking communication contention into account. Among these extensions, scheduling heuristics are considered in [16], in which each processor can communicate with at most one other processor at a given timestep (one-port model [73, 85, 23]). In Chapter 2, we discuss at length the realistic communication models on which we conduct our study.

Dynamic platforms are characterized by their larger size and greater degree of heterogeneity. The resources of these platforms (topology, message routes, etc.) and their characteristics are assumed to be known to a centralized control mechanism, even though they change over time. Radical changes are caused by failures, but the performance of a processor can also be slightly reduced because, for instance, another user has launched a new process on this resource. A typical example of such a platform is a general-purpose computational grid [51], or a set of resources provided by a team of users that can change significantly over time, as in volunteer computing [83]. Scheduling techniques that aim at dealing with the dynamic nature of such platforms are extensively discussed in Chapter 4. Note that a scheduling algorithm for such platforms may aim at optimizing the reliability of the schedule, in addition to the usual performance criteria that have been discussed so far, such as the application throughput and the latency (or makespan).

Alternative platforms. The scheduling of pipelined computations can also be conducted on special-purpose architectures and FPGA arrays; see for instance the representative work by Fabiani and Lavenier [48]. They study the placement of linear computations onto reconfigurable arrays.
Another line of work is related to the design of fault-tolerant or power-aware mappings for embedded systems. Representative examples are [133, 8]. We do not target such special-purpose architectures, but rather limit our study to large-scale distributed platforms.
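
The contrast drawn above between the contention-free model and the one-port model can be made concrete with a small sketch. It is an illustration only, not the authors' formulation: the function names, the single shared `bandwidth` parameter, and the assumption that transfer time equals message size divided by bandwidth are all hypothetical simplifications.

```python
from typing import Sequence


def contention_free_time(sizes: Sequence[float], bandwidth: float) -> float:
    """Time for one sender to deliver all messages when links never contend.

    Assumption: every message travels in parallel at full bandwidth, so the
    sender is done when its largest message has been delivered.
    """
    return max(sizes, default=0.0) / bandwidth


def one_port_time(sizes: Sequence[float], bandwidth: float) -> float:
    """Time for one sender to deliver all messages under a one-port model.

    Assumption: the sender talks to at most one other processor at a time,
    so the transfers are serialized and their durations add up.
    """
    return sum(sizes) / bandwidth


if __name__ == "__main__":
    # One unit-size message to each of a thousand processors.
    sizes = [1.0] * 1000
    print(contention_free_time(sizes, bandwidth=1.0))  # 1.0
    print(one_port_time(sizes, bandwidth=1.0))         # 1000.0
```

Under these assumptions the contention-free model predicts the same completion time for one message or a thousand, whereas the one-port model grows linearly with the number of destinations.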
doi:10.1007/978-3-642-23929-8_2 fatcat:efxs6lrvbvfuvocbrelyqdpyma