A Self-Adaptive Process Mining Algorithm Based on Information Entropy to Deal with Uncertain Data
Process mining extracts knowledge about a business process from event logs and produces a model of that process, which helps to detect problems in the process and to improve it. However, most existing process mining algorithms handle uncertain data poorly, and approaches that rely on a frequency threshold alone need to be enhanced. This paper improves the correlation measures used in heuristic mining to build a correlation matrix based on an
... d frequency matrix. Combined with the maximum entropy principle, a self-adaptive method for determining the threshold is given, and this threshold is used to remove uncertain data relationships from the logs. Furthermore, this study identifies selective and parallel structures through a modified frequency matrix, and a Petri net-based process model is obtained from a directed graph. Recognizing parallel structures helps to eliminate imbalances when calculating the threshold for dealing with uncertain data. Finally, this paper presents an algorithm framework for adaptively removing uncertain data. This study represents a new attempt to use entropy to remove uncertain data in the field of Business Process Management (BPM). The threshold does not require any parameters to be set in advance; the proposed algorithm is therefore self-adaptive and universal. Experimental results show that, on uncertain log data, the proposed algorithm achieves higher behavioral appropriateness, structural appropriateness, and fitness than traditional algorithms.
INDEX TERMS Process mining, entropy, maximum entropy principle, uncertain data, BPM.
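To make the ideas summarized in the abstract more concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes an event log given as a list of traces (lists of activity names), counts directly-follows frequencies, computes the classic heuristic-mining dependency measure (not the improved measure the paper introduces), and picks a frequency cut-off with a Kapur-style maximum entropy split, which is swapped in here only as a stand-in for the paper's threshold rule. All function names and the toy log are hypothetical.

```python
from collections import Counter
from math import log


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Classic heuristic-miner dependency measure (assumption: the paper's
    improved correlation measure differs from this one)."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


def entropy(weights):
    """Shannon entropy of the distribution obtained by normalizing weights."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return -sum((w / total) * log(w / total) for w in weights if w > 0)


def max_entropy_threshold(values):
    """Kapur-style split (illustrative stand-in): choose the cut that
    maximizes entropy(low group) + entropy(high group). No parameter is
    supplied by the user; the cut is derived from the data alone."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    best_cut, best_score = ordered[0], float("-inf")
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:
            continue  # only cut between distinct values
        score = entropy(ordered[:i]) + entropy(ordered[i:])
        if score > best_score:
            best_score, best_cut = score, ordered[i]
    return best_cut


# Toy log (hypothetical): 50 regular traces plus one deviating trace.
toy_log = [["a", "b", "c"]] * 50 + [["a", "c", "b"]]
toy_counts = directly_follows(toy_log)
toy_cut = max_entropy_threshold(list(toy_counts.values()))
# Relations whose frequency reaches the cut are kept; the rest are dropped.
toy_kept = sorted(pair for pair, n in toy_counts.items() if n >= toy_cut)
```

On the toy log, the cut separates the two frequent relations from the two relations that occur only in the deviating trace, without a manually chosen threshold. A sketch like this ignores the parallel-structure imbalance that the abstract mentions, which is one reason it should not be read as the paper's method.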