Scheduling in Heterogeneous Distributed Computing Systems Based on Internal Structure of Parallel Tasks Graphs with Meta-Heuristics
The problem of scheduling parallel tasks graphs (PTGs), represented as directed acyclic graphs (DAGs), in heterogeneous distributed computing systems (HDCSs) is considered a nondeterministic polynomial time (NP) problem because of the diversity of characteristics and parameters, often conflicting, that must be optimized. The PTGs are scheduled by a scheduler that determines the best location for the sub-tasks that constitute the PTGs and is responsible for allocating the resources of the HDCS to
the sub-tasks of the PTGs. To optimize scheduling and allocation, the scheduler extracts characteristics from the internal structure of the PTGs. The prevailing characteristic in existing research is the critical path (CP), which is limited to providing the execution paths of PTGs. To address this limitation, we extend the array method proposed by Velarde, which extracts two characteristics in addition to the CP: the layering and the density of the graph. These characteristics are represented as integer values of the PTGs to be scheduled; the values are stored in arrays representing populations, which are evaluated with the univariate marginal distribution algorithm (UMDA) heuristic and, for comparison, with a genetic algorithm. With the best allocations produced by the algorithms, two performance parameters are evaluated: makespan and waiting time. The results indicate that when more PTG characteristics are considered, resource allocations improve and scheduling times are reduced. The results also show that UMDA provides shorter scheduling and allocation times than the genetic algorithm: UMDA distributes the sub-tasks widely across the clusters, whereas the genetic algorithm compacts the assignments of the PTGs in the clusters and converges more slowly, which translates into longer scheduling and allocation times. These conclusions are explained in detail in this work, based on the conducted experiments.
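As a rough illustration of the three structural characteristics named above, the following minimal sketch computes the critical path length, the layering, and a density value for a small DAG. It is not the array method described in this work. The function name `ptg_characteristics`, the input format, and the example costs are hypothetical. In particular, the density formula used here (number of edges divided by the maximum number of edges a DAG on n nodes can have) is an assumption, since the abstract does not define density.

```python
from collections import defaultdict, deque


def ptg_characteristics(costs, edges):
    """Return (critical_path_length, layers, density) for a DAG.

    costs: dict mapping each sub-task to its execution cost (hypothetical units)
    edges: list of (predecessor, successor) pairs
    """
    if not costs:
        return 0, [], 0.0

    succ = defaultdict(list)
    indeg = {t: 0 for t in costs}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    # Kahn's algorithm: process sub-tasks in topological order.
    queue = deque(t for t in costs if indeg[t] == 0)
    layer = {t: 0 for t in queue}          # layer index of each sub-task
    finish = {t: costs[t] for t in queue}  # longest path cost ending at each sub-task
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            layer[v] = max(layer.get(v, 0), layer[u] + 1)
            finish[v] = max(finish.get(v, 0), finish[u] + costs[v])
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != len(costs):
        raise ValueError("graph contains a cycle, so it is not a DAG")

    # Group sub-tasks by layer, keeping the insertion order of `costs`.
    layers = [[] for _ in range(max(layer.values()) + 1)]
    for t in costs:
        layers[layer[t]].append(t)

    # ASSUMPTION: density = edges / maximum possible edges in a DAG, n(n-1)/2.
    n = len(costs)
    density = len(edges) / (n * (n - 1) / 2) if n > 1 else 0.0

    return max(finish.values()), layers, density


# Hypothetical four-task PTG shaped like a diamond.
costs = {"A": 2, "B": 3, "C": 1, "D": 4}
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
cp_length, layers, density = ptg_characteristics(costs, edges)
print(cp_length, layers, round(density, 3))
```

For the diamond-shaped example, the longest path is A, B, D with cost 9, the layering has three levels, and four of the six possible edges are present. A scheduler could encode such values as integers per PTG before handing them to a search heuristic, although the exact encoding used in this work is not specified in the abstract.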