Non-strict independence-based program parallelization using sharing and freeness information

Daniel Cabeza Gras, Manuel V. Hermenegildo
2009 Theoretical Computer Science  
Keywords: Parallelism; Automatic parallelization; Abstract interpretation; Abstract domains; Sharing and freeness; Non-strict independence; Parallelizing compilers; Declarative languages; Logic programming

The current ubiquity of multi-core processors has brought renewed interest in program parallelization. Logic programs allow studying the parallelization of programs with complex, dynamic data structures with (declarative) pointers in a comparatively simple semantic setting. In this context, automatic parallelizers which exploit and-parallelism rely on notions of independence in order to ensure certain efficiency properties. "Non-strict" independence is a more relaxed notion than the traditional notion of "strict" independence which still ensures the relevant efficiency properties and can allow considerably more parallelism. Non-strict independence cannot be determined solely at run-time ("a priori") and thus global analysis is a requirement. However, extracting non-strict independence information from available analyses and domains is non-trivial. This paper provides on one hand an extended presentation of our classic techniques for compile-time detection of non-strict independence based on extracting information from (abstract interpretation-based) analyses using the now well understood and popular Sharing + Freeness domain. This includes algorithms for combined compile-time/run-time detection which involve special run-time checks for this type of parallelism. In addition, we propose herein novel annotation (parallelization) algorithms, URLP and CRLP, which are specially suited to non-strict independence. We also propose new ways of using the Sharing + Freeness information to optimize how the run-time environments of goals are kept apart during parallel execution. Finally, we describe the implementation of these techniques in our parallelizing compiler and recall some early performance results. We provide as well an extended description of our pictorial representation of sharing and freeness information.

Logic programs have favorable characteristics, stemming from their high-level and declarative nature, and at the same time they pose, in a semantically clean and well-understood setting, challenges which relate closely to the most difficult scenarios faced in traditional parallelization [27].
In particular, interesting challenges faced during the parallelization of logic programs include the presence of dynamically allocated, complex data structures containing "(declarative) pointers" (logical variables), non-trivial notions of independence, the presence of highly irregular computations and dynamic control flow, and having to deal with speculative computations and search. As a result, advances in the parallelization of logic programs also shed light on the parallelization of current and future imperative languages.

Logic programs exhibit several kinds of parallelism [13,24], among which or-parallelism and and-parallelism are the most exploited in practice. In this paper we concentrate on and-parallelism, which manifests itself in applications in which a given problem can be divided into a number of independent sub-problems. For example, it appears in algorithms where independent, possibly recursive calls or loop iterations can be executed in parallel (simple, well-known examples are quick-sort or matrix multiplication). Some examples of systems which exploit and-parallelism are &-Prolog [10,26,29] (and, more recently, its successor Ciao [28,30]), ROPM [56], AO-WAM [23], ACE [54,55], DDAS/Prometheus [58,59], systems based on the "Extended" Andorra Model [63] such as AKL [41], etc. (please see their references and [24] for other related systems).

The objective of the parallelization process performed by a parallelizing compiler is to uncover as much as possible of the available parallelism in the program, while guaranteeing that the correct results are computed (i.e., correctness) and that other observable characteristics of the program, such as execution time, are improved (speedup) or, at the minimum, preserved (no-slowdown) (i.e., efficiency). A central issue is, then, under which conditions two parts of a (logic) program can be correctly and efficiently parallelized.
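As a rough illustration of the kind of and-parallelism of independent recursive calls mentioned above (a sketch in Python rather than Prolog, and not the paper's actual &-Prolog/Ciao machinery), the two recursive calls of quick-sort operate on disjoint partitions and can therefore run in parallel without affecting the result:

```python
from concurrent.futures import ThreadPoolExecutor

def qsort(xs):
    """Quick-sort whose two recursive calls share no data and can
    therefore be run in and-parallel: they are independent goals."""
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The two sub-sorts are independent: neither reads nor writes
        # the other's partition, so parallel execution preserves the
        # sequential result (correctness) and adds no extra work
        # beyond scheduling overhead (no-slowdown, in the paper's terms).
        f1 = pool.submit(qsort, smaller)
        f2 = pool.submit(qsort, larger)
        return f1.result() + [pivot] + f2.result()

print(qsort([3, 1, 4, 1, 5, 9, 2, 6]))  # prints [1, 1, 2, 3, 4, 5, 6, 9]
```

In a system such as &-Prolog the same structure would be expressed with the parallel conjunction operator between the two recursive goals, with the compiler or run-time checks guaranteeing their independence.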
All of the systems exploiting and-parallelism mentioned above rely on some notion of independence (also referred to as "stability" [25]) among non-deterministic goals being run in and-parallel in order to ensure these important efficiency properties. Two basic notions of independence for logic programs are strict and non-strict independence [33-35]. Other, more general notions have been developed based directly on search space preservation and which are applicable to constraint logic programs [17,18], but herein we concentrate on the former classic notions for the Herbrand domain.

Strict independence. Strict independence corresponds to the traditional notion of independence, normally applied to goals [13,21,29]: two goals g1 and g2 are said to be strictly independent for a substitution θ iff var(g1θ) ∩ var(g2θ) = ∅, where var(g) is the set of variables that appear in g. Accordingly, n goals g1, ..., gn are said to be strictly independent for a substitution θ if they are pairwise strictly independent for θ. Parallelization of strictly independent goals has the property of preserving the search space of the goals involved, so that correctness and efficiency of the original program (using a left-to-right computation rule) are maintained and a no-slowdown condition can be ensured [33-35]. A convenient characteristic of strict independence is that it is an "a priori" condition, i.e., it can be tested at run-time ahead of the execution of the goals. Furthermore, tests for strict independence can be expressed directly in terms of groundness and independence of the variables involved. This allows relatively simple compile-time parallelization by introducing run-time tests in the program.
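The a priori test var(g1θ) ∩ var(g2θ) = ∅ can be made concrete with a small sketch. The term representation below (uppercase strings as logical variables, tuples as compound terms) is a hypothetical encoding chosen for illustration, not the representation used by the paper's system:

```python
def term_vars(t, subst):
    """Variables of term t after applying substitution subst.
    Encoding (illustrative only): a string starting with an uppercase
    letter is a logical variable, a tuple (functor, arg1, ...) is a
    compound term, and anything else is a constant."""
    if isinstance(t, str) and t[:1].isupper():       # logical variable
        if t in subst:
            return term_vars(subst[t], subst)        # follow its binding
        return {t}
    if isinstance(t, tuple):                         # compound term
        return set().union(*(term_vars(a, subst) for a in t[1:])) \
            if len(t) > 1 else set()
    return set()                                     # constant

def strictly_independent(goals, subst):
    """Pairwise check: var(g_i θ) and var(g_j θ) are disjoint for i < j."""
    vs = [term_vars(g, subst) for g in goals]
    return all(vs[i].isdisjoint(vs[j])
               for i in range(len(vs))
               for j in range(i + 1, len(vs)))

# p(X) and q(Y) with θ = {X/f(Z)}: var sets {Z} and {Y} are disjoint.
print(strictly_independent([('p', 'X'), ('q', 'Y')], {'X': ('f', 'Z')}))  # True
# With θ = {X/Y} both goals reach Y, so they are not strictly independent.
print(strictly_independent([('p', 'X'), ('q', 'Y')], {'X': 'Y'}))         # False
```

Note how the second case reflects the groundness/independence tests mentioned above: binding X to the ground term 'a' instead of to Y would restore independence, since a ground goal shares no variables with any other goal.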
doi:10.1016/j.tcs.2009.07.044