Centralized versus Distributed Schedulers for Bag-of-Tasks Applications

O. Beaumont, L. Carter, J. Ferrante, A. Legrand, L. Marchal, Y. Robert
2008 IEEE Transactions on Parallel and Distributed Systems  
Multiple applications that execute concurrently on heterogeneous platforms compete for CPU and network resources. In this paper, we consider the problem of scheduling applications to ensure fair and efficient execution on a distributed network of processors. We limit our study to the case where communication is restricted to a tree embedded in the network, and the applications consist of a large number of independent tasks (Bags of Tasks) that originate at the tree's root. The tasks of a given application all have the same computation and communication requirements, but these requirements can vary for different applications. The goal of scheduling is to maximize the throughput of each application while ensuring a fair sharing of resources between applications. We can find the optimal asymptotic rates by solving a linear programming problem that expresses all necessary problem constraints, and we show how to construct a periodic schedule from any linear program solution. For single-level trees, the solution is characterized by processing tasks with larger communication-to-computation ratios at children with larger bandwidths. For multilevel trees, this approach requires global knowledge of all application and platform parameters. For large-scale platforms, such global coordination by a centralized scheduler may be unrealistic. Thus, we also investigate decentralized schedulers that use only local information at each participating resource. We assess their performance via simulation and compare to an optimal centralized solution obtained via linear programming. The best of our decentralized heuristics achieves the same performance on about 2/3 of our test cases but is far worse in a few cases. Although our results are based on simple assumptions and do not explore all parameters (such as the maximum number of tasks that can be held on a node), they provide insight into the important question of fairly and optimally scheduling heterogeneous applications on heterogeneous grids.
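
The abstract states that a periodic schedule can be constructed from any linear program solution. As a minimal sketch of one step such a construction typically needs, and not necessarily the authors' own procedure, the Python below turns rational per-application rates into whole task counts per period. The function name `periodic_schedule`, the application labels, and the example rates are all hypothetical, and the sketch ignores communication along the tree entirely.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9+


def periodic_schedule(rates):
    """Convert rational task rates into integer task counts per period.

    rates: dict mapping an application label to a Fraction giving its
    throughput in tasks per time unit (assumed to come from an LP solution).
    Returns (period, tasks), where tasks[app] is the whole number of tasks
    of that application to process in each period.
    """
    # The least common multiple of the denominators is the shortest period
    # in which every application completes a whole number of tasks.
    period = lcm(*(r.denominator for r in rates.values()))
    tasks = {app: int(r * period) for app, r in rates.items()}
    return period, tasks


# Hypothetical rates, chosen only for illustration.
rates = {"A1": Fraction(3, 4), "A2": Fraction(1, 6)}
period, tasks = periodic_schedule(rates)
print(period, tasks)
```

With these illustrative rates the period is 12 time units, during which 9 tasks of `A1` and 2 tasks of `A2` are processed. Exact `Fraction` arithmetic is used so that the period is not distorted by floating-point rounding.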
doi:10.1109/tpds.2007.70747 fatcat:3mu6fjjhlrhrrbqr6fgdkbsjd4