On the Usefulness of Weight-Based Constraints in Frequent Subgraph Mining [chapter]

Frank Eichinger, Matthias Huber, Klemens Böhm
2010 Research and Development in Intelligent Systems XXVII  
Frequent subgraph mining is an important data-mining technique. In this paper we look at weighted graphs, which are ubiquitous in the real world. The analysis of weights in combination with mining for substructures might yield more precise results. In particular, we study frequent subgraph mining in the presence of weight-based constraints and explain how to integrate them into mining algorithms. While such constraints only yield approximate mining results in most cases, we demonstrate that
... results are useful nevertheless and explain this effect. To do so, we both assess the completeness of the approximate result sets and carry out application-oriented studies with real-world data-analysis problems: software-defect localization, weighted graph classification and explorative mining in logistics. Our results show that the runtime can improve by a factor of up to 3.5 in defect localization and classification and 7 in explorative mining. At the same time, we obtain slightly increased defect-localization precision, stable classification precision and good explorative mining results.
Definition 7. A lower bound predicate $c_l$ for a pattern $p$ is a predicate with the following structure: An upper bound predicate $c_u$ in turn is as follows: $c_u(p) := (\neg\exists\, e_2 \in E(p) : \mathit{measure}(e_2) > t_u) \lor (|p| < \mathit{size}_{\mathit{min}})$. A weight-based constraint, applied to a pattern $p$, is a set containing $c_l$, $c_u$, or both, connected conjunctively.
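To make Definition 7 concrete, the following is a minimal Python sketch of the upper bound predicate and of a conjunctively connected constraint set. It is not the authors' implementation. Several things are assumptions made for illustration: the names `Edge`, `Pattern`, `upper_bound_predicate` and `satisfies` are hypothetical, `measure` is taken to be the raw edge weight, and |p| is taken to be the number of edges in the pattern. The sketch also reads the upper bound predicate as "no edge of p has a measure above t_u, or p is smaller than size_min"; that reading of the quantifier is an assumption, chosen because it is the one under which c_u acts as an upper bound. The lower bound predicate is not sketched because its formula is not given in this record.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class Edge:
    """Hypothetical weighted edge; the field names are illustrative only."""
    source: str
    target: str
    weight: float


# Assumption: a pattern is represented by its edge set E(p) only.
Pattern = Tuple[Edge, ...]
Predicate = Callable[[Pattern], bool]


def measure(edge: Edge) -> float:
    # Assumption: the measure is simply the edge weight.
    return edge.weight


def upper_bound_predicate(t_u: float, size_min: int) -> Predicate:
    """Build c_u for a threshold t_u and a minimum pattern size size_min."""
    def c_u(p: Pattern) -> bool:
        # Assumed reading: no edge exceeds the threshold t_u ...
        no_edge_above = not any(measure(e) > t_u for e in p)
        # ... or the pattern is still smaller than size_min
        # (assumption: |p| is the number of edges).
        return no_edge_above or len(p) < size_min
    return c_u


def satisfies(p: Pattern, constraint: Iterable[Predicate]) -> bool:
    """A constraint is a set of predicates connected conjunctively."""
    return all(c(p) for c in constraint)


c_u = upper_bound_predicate(t_u=5.0, size_min=2)

small = (Edge("a", "b", 9.0),)                          # below size_min
within = (Edge("a", "b", 1.0), Edge("b", "c", 2.0))     # all weights <= t_u
exceeds = (Edge("a", "b", 1.0), Edge("b", "c", 9.0))    # one weight > t_u

print(satisfies(small, [c_u]), satisfies(within, [c_u]), satisfies(exceeds, [c_u]))
```

Under these assumptions, patterns smaller than `size_min` always pass, so a miner would keep extending small patterns regardless of their weights and only apply the threshold once a pattern has reached the minimum size.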
doi:10.1007/978-0-85729-130-1_5 dblp:conf/sgai/EichingerHB10 fatcat:etdvl2r3hnfzblll3hoabjsuzi