THE MINADEPT CLUSTERING APPROACH FOR DISCOVERING REFERENCE PROCESS MODELS OUT OF PROCESS VARIANTS

CHEN LI, MANFRED REICHERT, ANDREAS WOMBACHER
2010 International Journal of Cooperative Information Systems  
In recent years, a new generation of adaptive Process-Aware Information Systems (PAIS) has emerged that enables dynamic process changes at runtime while preserving PAIS robustness and consistency. Such adaptive PAIS allow authorized users to add new process activities, to delete existing activities, or to change pre-defined activity sequences during runtime. Both this runtime flexibility and process configuration at build-time lead to a large number of process variants being derived
from the same process model, but differing slightly in structure due to the applied changes. Generally, process variants are expensive to configure and difficult to maintain. This paper presents selected results from our MinAdept project. In particular, we provide a clustering algorithm that fosters learning from past process changes by mining a collection of process variants. As the mining result, we obtain a process model for which the average distance to the process variant models becomes minimal. By adopting this process model as the reference model in the PAIS, the need for future process configuration and adaptation decreases. We have validated our clustering algorithm by means of a case study as well as comprehensive simulations. Altogether, our vision is to enable full process lifecycle support in adaptive PAIS.

* This work was done in the MinAdept project, which has been supported by the Netherlands Organization for Scientific Research (NWO) under contract number 612.066.512.

Backgrounds

We first introduce basic notions needed in the following:

Process Model: Let P denote the set of all sound (i.e., correct) process models. We call a process model sound if it contains no deadlocks or unreachable activities [41, 62]. In our context, a particular process model S = (N, E, ...) ∈ P is defined in terms of an Activity Net [41]: N constitutes the set of activities {a_1, ..., a_n} and E the set of control edges (i.e., precedence relations) linking them. More precisely, Activity Nets cover the following fundamental process patterns: Sequence, AND-split, AND-join, XOR-split, XOR-join, and Loop [60]. These patterns constitute the core set of any workflow specification language (e.g., WS-BPEL [4] and BPMN [5]) and cover most of the process models we can find in practice [75, 33]. Furthermore, based on these patterns we are able to compose more complex ones if required (e.g., an OR-split can be mapped to AND- and XOR-splits [37]).
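To make the notion of a process model S = (N, E, ...) more concrete, the following is a minimal Python sketch, not the representation used in the MinAdept project. It assumes a deliberately simplified Activity Net that stores only the activity set N and the control edges E, and it checks only one half of the soundness condition mentioned above, namely that no activity is unreachable. Deadlock detection and the split/join semantics are omitted. The names `ActivityNet` and `unreachable_activities` and the example net are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityNet:
    """Simplified stand-in for S = (N, E, ...): activities plus control edges."""
    activities: frozenset  # N, the set of activities
    edges: frozenset       # E, control edges as (source, target) pairs


def unreachable_activities(net, start):
    """Return the activities that cannot be reached from `start` along E."""
    # Build an adjacency map from the control edges.
    successors = {activity: set() for activity in net.activities}
    for source, target in net.edges:
        successors[source].add(target)

    # Breadth-first traversal from the start activity.
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in successors[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return set(net.activities) - seen


# Hypothetical net: A, then B and C, then D; E has no incoming edge.
net = ActivityNet(
    activities=frozenset("ABCDE"),
    edges=frozenset({("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}),
)
print(unreachable_activities(net, "A"))  # {'E'}
```

Under this simplification, a net with a non-empty result would not be sound; an empty result is necessary but not sufficient for soundness, since deadlocks are not examined.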
Finally, when restricting process modeling to these basic process patterns, we obtain models that are easier to understand and less error-prone [36, 34]. A simple example of an Activity Net is depicted in Fig. 3a. For a detailed description of Activity Nets and related correctness issues, we refer to [41].

Block Structuring: To limit the scope, we assume Activity Nets to be block-structured, i.e., sequences, branchings (based on the aforementioned split and join patterns), and loops are represented as blocks with well-defined start and end nodes. These blocks may be nested, but must not overlap; i.e., the nesting must be regular [41, 22]. In a process model S, a block may be a single activity, a self-contained part of S, or S itself. As an example, consider process model S from Fig. 3. Here {A}, {A,B}, {C,F}, and {A,
doi:10.1142/s0218843010002139 fatcat:4uunu5jrcnax3ei767nmpsguj4