Utility functions for adaptively executing concurrent workflows
Concurrency and Computation
Workflows are widely used in applications that require coordinated use of computational resources. Workflow definition languages typically abstract over some aspects of the way in which a workflow is to be executed, such as the level of parallelism to be used or the physical resources to be deployed. As a result, a workflow management system has the responsibility of establishing how best to map tasks within a workflow to the available resources. As workflows are typically run over shared resources, and thus face unpredictable and changing resource capabilities, there may be benefit to be derived from adapting the task-to-resource mapping while a workflow is executing. This paper describes the use of utility functions to express the relative merits of alternative mappings; in essence, a utility function can be used to give a score to a candidate mapping, and the exploration of alternative mappings can be cast as an optimization problem. In this approach, changing the utility function allows adaptations to be carried out with a view to meeting different objectives. The contributions of this paper include: (i) a description of how adaptive workflow execution can be expressed as an optimization problem where the objective of the adaptation is to maximize a utility function; (ii) a description of how the approach has been applied to support adaptive workflow execution in execution environments consisting of multiple resources, such as grids or clouds, in which adaptations are coordinated across multiple workflows; and (iii) an experimental evaluation of the approach with utility measures based on response time and profit using the Pegasus workflow system.
... part of a workflow at a time, or by dynamically revising compilation decisions that give rise to a concrete workflow while it is executing. In principle, any decision that was made statically during workflow compilation can be revisited at runtime. Adaptations can be performed for different reasons, including (i) changes to the execution environment, (ii) changes to the workflow description and (iii) adaptation to user (workflow submitter) requirements. Changes to the execution environment include resources becoming available, computation bottlenecks being detected or resources failing. The workflow description can be changed in order to take advantage of new data or services becoming available.
A user may decide to change the overall task to, for example, require provenance data to be collected or use specific computational services. Furthermore, adaptations can differ in character: they may be prospective (to improve future performance), reactive (to react to previous results) or altruistic (to aid other areas of the workflow).
In common with adaptive and autonomic computing techniques in other areas, in this paper adaptive workflow execution involves a feedback loop, the implementation of which differs from platform to platform, but in which various phases recur: monitoring records information about workflow progress and/or the execution environment; an analysis activity identifies potential problems and/or opportunities; a planning phase explores alternatives to the current evaluation strategy; and, if adapting is considered beneficial, an execution step takes place whereby a revised evaluation strategy is adopted. Adaptive workflow execution techniques may differ in all these phases. In this paper: monitoring captures progress information in the form of job completion times and queue lengths; analysis identifies where monitoring information departs from expectations; planning uses utility functions to consider how different allocations of tasks to resources may give rise to higher utility (in the form of reduced response times or increased profits); and execution applies the updated resource allocations, reusing work carried out to date.
A software framework has been developed that assigns to each of these phases software components, built by the authors, that are generic by design. These components can then be instantiated to become the autonomic manager of a specific software artefact simply by being provided with specifications of what to monitor, how to analyze the monitoring information, how to plan an adaptation and how to execute it.
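The monitor, analyze, plan and execute phases described above can be illustrated with a minimal sketch. It is not the authors' framework: the class AutonomicManager, the function demo, and the toy queue-time figures are all hypothetical, chosen only to show how a generic loop might be specialized by supplying four functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# All names here are hypothetical; none are taken from the authors' framework.
Observation = Dict[str, float]  # e.g. observed queue time per site, in seconds
Plan = Dict[str, str]           # job name -> execution site


@dataclass
class AutonomicManager:
    """Generic feedback loop, specialized by four caller-supplied functions."""
    monitor: Callable[[], Observation]
    analyze: Callable[[Observation], bool]         # True if adapting may help
    plan: Callable[[Observation], Optional[Plan]]  # None means keep current plan
    execute: Callable[[Plan], None]

    def step(self) -> bool:
        """Run one monitor/analyze/plan/execute cycle; return True if adapted."""
        observation = self.monitor()
        if not self.analyze(observation):
            return False
        new_plan = self.plan(observation)
        if new_plan is None:
            return False
        self.execute(new_plan)
        return True


def demo() -> Dict[str, str]:
    """Instantiate the loop for a toy two-site setting and run one cycle."""
    expected_queue_time = {"site_a": 10.0, "site_b": 10.0}  # invented values
    observed_queue_time = {"site_a": 45.0, "site_b": 8.0}   # invented values
    assignment = {"job1": "site_a", "job2": "site_a"}

    def monitor() -> Observation:
        return dict(observed_queue_time)

    def analyze(obs: Observation) -> bool:
        # Flag a problem when any site is more than twice as slow as expected.
        return any(obs[s] > 2 * expected_queue_time[s] for s in obs)

    def plan(obs: Observation) -> Optional[Plan]:
        # Naive planner: move every job to the site with the shortest queue.
        best_site = min(obs, key=obs.get)
        proposal = {job: best_site for job in assignment}
        return proposal if proposal != assignment else None

    def execute(new_plan: Plan) -> None:
        assignment.update(new_plan)

    manager = AutonomicManager(monitor, analyze, plan, execute)
    manager.step()
    return assignment
```

In this sketch the planner is deliberately naive; in the approach described in the text, the planning step would instead search for the allocation with the highest utility.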
In the case of this paper, the specific managed artefact is a workflow engine, comprising a compiler from abstract to concrete workflows and a job manager. Therefore, the specifications that are passed into the adaptivity framework to make its behavior specific to the managed artefact cause the resulting system to behave in an autonomic manner.
The context for this work is illustrated in Figure 1. In essence, workflows are submitted to an autonomic workflow mapper, which adaptively assigns the jobs in the workflows to execution sites. Each execution site queues jobs for execution on one or more computational nodes. Given some objective, such as to minimize total execution times or, more generally, to optimize for some Quality of Service (QoS) target, the autonomic workflow mapper must determine which jobs to assign to each of the available execution sites, revising the assignment during workflow execution on the basis of feedback on the progress of the submitted jobs. In this paper we describe two different utility measures, namely response time and profit, to capture QoS targets within a consistent framework.
To the best of our knowledge, our work is the first to make use of the combination of utility functions and optimization algorithms for adaptive workflow execution. In so doing, we bring to the table a declarative approach to dynamic scheduling in which an optimization algorithm proposes assignments of tasks to resources that maximize utility, following the strategy of Kephart and Das. We note that the terms utility and utility function are used quite widely; in general terms, a utility function is a function that computes a value that represents the desirability of a state. However, approaches that seek to maximize some measure of utility differ in the way in which utility informs decision making.
For example, Huebscher and McCann use utility functions to express application requirements for component selection, in a setting where each component provides the parameters that enable its utility to be computed (i.e. there is no search problem as such; the utility function is essentially metadata that informs component selection). By contrast, in workflow scheduling, Yu et al. use a utility measure to give a value to assignments of tasks to resources, in an adaptive scheduling system where the overall problem is divided into a number of ...
Figure 6. Experiment 1: Montage workflow progress plots, showing when each job (task) is queued and executed, with an adaptive workflow execution strategy using the utility based on the response time. (Panel: Adaptive. Key: Job Queued, Job Executing.)
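As a rough illustration of the idea, discussed in the text above, that a utility function scores a candidate mapping and that changing the utility function changes the objective, the following sketch may help. Everything in it is an assumption made for illustration: the job and site names, the numbers, the two utility formulas (negated estimated completion time, and payment minus the cost of busy site time) and the exhaustive search, which merely stands in for whatever optimization algorithm is actually used.

```python
from itertools import product
from typing import Callable, Dict

Mapping = Dict[str, str]  # job name -> site name

# Toy inputs, invented for illustration only.
JOB_WORK = {"j1": 12.0, "j2": 4.0}       # abstract work units per job
SITE_SPEED = {"fast": 2.0, "slow": 1.0}  # work units processed per second
SITE_PRICE = {"fast": 3.0, "slow": 1.0}  # cost per busy second
PAYMENT, DEADLINE = 100.0, 14.0          # reward paid only if the deadline is met


def busy_time(mapping: Mapping) -> Dict[str, float]:
    """Seconds each site spends running its assigned jobs one after another."""
    busy = {site: 0.0 for site in SITE_SPEED}
    for job, site in mapping.items():
        busy[site] += JOB_WORK[job] / SITE_SPEED[site]
    return busy


def response_time_utility(mapping: Mapping) -> float:
    """Higher is better: negated completion time of the busiest site."""
    return -max(busy_time(mapping).values())


def profit_utility(mapping: Mapping) -> float:
    """Payment (only if the deadline is met) minus the cost of busy site time."""
    busy = busy_time(mapping)
    revenue = PAYMENT if max(busy.values()) <= DEADLINE else 0.0
    return revenue - sum(SITE_PRICE[s] * t for s, t in busy.items())


def best_mapping(utility: Callable[[Mapping], float]) -> Mapping:
    """Exhaustive search over all job-to-site assignments (a stand-in optimizer)."""
    jobs, sites = list(JOB_WORK), list(SITE_SPEED)
    candidates = (
        dict(zip(jobs, choice)) for choice in product(sites, repeat=len(jobs))
    )
    return max(candidates, key=utility)
```

With these toy numbers, best_mapping(response_time_utility) places j1 on the fast site and j2 on the slow site, whereas best_mapping(profit_utility) places j1 on the slow site and j2 on the fast site. The same search therefore yields different mappings once the utility function is swapped. Exhaustive search is only workable for very small inputs, since the number of candidates grows exponentially with the number of jobs.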