Dynamic Load Balancing of Parallel Computational Iterative Routines on Platforms with Memory Heterogeneity [chapter]

David Clarke, Alexey Lastovetsky, Vladimir Rychkov
2011 Lecture Notes in Computer Science  
High performance of data-parallel applications on heterogeneous platforms can be achieved by partitioning the data in proportion to the speeds of processors. It has been proven that the speed functions built from a history of time measurements better reflect different aspects of heterogeneity of processors. However, existing data partitioning algorithms based on functional performance models impose some restrictions on the shape of speed functions, which are not always satisfied if we try to approximate the real-life measurements accurately enough. This paper presents a new data partitioning algorithm that applies multidimensional solvers to numerical solution of the system of non-linear equations formalizing the problem of optimal data partitioning. This algorithm relaxes the restrictions on the shape of speed functions and uses the Akima splines for more accurate and realistic approximation of the real-life speed functions. The better accuracy of the approximation in its turn results in a more optimal distribution of the computational load between the heterogeneous processors.
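The partitioning problem described in the abstract can be illustrated with a minimal sketch: given a speed function s_i(x) for each processor, find shares x_i that sum to n and give every processor the same execution time x_i / s_i(x_i). The sketch below is not the paper's algorithm. It swaps in piecewise-linear interpolation in place of Akima splines and nested bisection in place of a multidimensional solver, so it assumes that x / s_i(x) increases with x, which is exactly the kind of shape restriction the paper sets out to relax. All function names and the sample speed points are hypothetical.

```python
from bisect import bisect_right
from math import floor


def speed(points, x):
    """Piecewise-linear speed through sorted (size, speed) points, clamped at both ends."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    k = bisect_right(xs, x)
    x0, x1, y0, y1 = xs[k - 1], xs[k], ys[k - 1], ys[k]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def size_for_time(points, t, n):
    """Largest x in [0, n] with x / speed(x) <= t (assumes that time grows with x)."""
    lo, hi = 0.0, float(n)
    for _ in range(60):
        mid = (lo + hi) / 2
        if mid / speed(points, mid) <= t:
            lo = mid
        else:
            hi = mid
    return lo


def partition(models, n):
    """Split n units so that all processors finish at (nearly) the same time."""
    # Upper bound on the common time: the slowest processor does all the work.
    t_lo, t_hi = 0.0, max(n / speed(m, n) for m in models)
    for _ in range(60):
        t = (t_lo + t_hi) / 2
        if sum(size_for_time(m, t, n) for m in models) < n:
            t_lo = t
        else:
            t_hi = t
    shares = [size_for_time(m, t_hi, n) for m in models]
    # Round down, then hand leftover units to the largest fractional parts.
    result = [floor(s) for s in shares]
    leftover = n - sum(result)
    order = sorted(range(len(shares)), key=lambda i: shares[i] - result[i], reverse=True)
    for i in order[:leftover]:
        result[i] += 1
    return result


# Hypothetical speed points: (problem size, units processed per second).
constant_slow = [(0, 100.0), (1000, 100.0)]
constant_fast = [(0, 300.0), (1000, 300.0)]
# Speed drops once the share no longer fits in memory (assumed shape).
memory_bound = [(0, 300.0), (500, 300.0), (1000, 100.0)]

print(partition([constant_slow, constant_fast], 1000))
print(partition([constant_slow, memory_bound], 1000))
```

With constant speeds the result reduces to partitioning in proportion to the speeds. With the memory-bound model the faster processor receives a smaller share than its peak speed alone would suggest, because its speed falls as its share grows.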
doi:10.1007/978-3-642-21878-1_6 fatcat:h4hlt2ib7ngl3aorgcoozqlm5a