A partition-based approach to structure similarity search

Xiang Zhao, Chuan Xiao, Xuemin Lin, Qing Liu, Wenjie Zhang
2013 Proceedings of the VLDB Endowment  
Graphs are widely used to model complex data in many applications, such as bioinformatics, chemistry, social networks, and pattern recognition. A fundamental and critical query primitive is to efficiently search for similar structures in a large collection of graphs. This paper studies graph similarity queries with edit distance constraints. Existing solutions to the problem utilize fixed-size overlapping substructures to generate candidates, and thus become susceptible to large vertex degrees or large distance thresholds. In this paper, we present a partition-based approach to tackle the problem. By dividing data graphs into variable-size nonoverlapping partitions, the edit distance constraint is converted to a graph containment constraint for candidate generation. We develop efficient query processing algorithms based on the new paradigm. A candidate pruning technique and an improved graph edit distance algorithm are also developed to further boost the performance. In addition, a cost-aware graph partitioning technique is devised to optimize the index. Extensive experiments demonstrate that our approach significantly outperforms existing approaches.
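The conversion from an edit distance constraint to a containment constraint can be read as a pigeonhole argument. If a data graph is split into tau + 1 nonoverlapping partitions and at most tau edit operations separate it from the query, at least one partition is untouched and must therefore appear inside the query. The sketch below illustrates only that filtering idea under simplifying assumptions: vertex labels only, undirected unlabeled edges, hand-picked partitions, and a brute-force containment test. The names `graph`, `contains`, and `is_candidate` are hypothetical, and this is not the paper's indexing or query processing algorithm.

```python
from itertools import permutations


def graph(labels, edges):
    """Build a tiny labeled graph: (vertex -> label, set of undirected edges)."""
    return dict(labels), {frozenset(e) for e in edges}


def contains(part, query):
    """Brute-force test: is `part` a (non-induced) labeled subgraph of `query`?

    Tries every injective mapping of partition vertices onto query vertices,
    so it is only suitable for very small graphs.
    """
    p_labels, p_edges = part
    q_labels, q_edges = query
    p_nodes = list(p_labels)
    for image in permutations(q_labels, len(p_nodes)):
        m = dict(zip(p_nodes, image))
        labels_ok = all(p_labels[v] == q_labels[m[v]] for v in p_nodes)
        edges_ok = all(frozenset(m[v] for v in e) in q_edges for e in p_edges)
        if labels_ok and edges_ok:
            return True
    return False


def is_candidate(partitions, query):
    """Keep a data graph only if some partition is contained in the query.

    Assumes `partitions` holds tau + 1 vertex- and edge-disjoint pieces of
    the data graph, where tau is the edit distance threshold.
    """
    return any(contains(p, query) for p in partitions)


# Data graph: path C - C - O - N, threshold tau = 1, hence two partitions.
# The edge between vertices 1 and 2 crosses partitions and is left out here.
parts = [
    graph({0: "C", 1: "C"}, [(0, 1)]),
    graph({2: "O", 3: "N"}, [(2, 3)]),
]

# Query one relabel away from the data graph (N replaced by S).
near = graph({0: "C", 1: "C", 2: "O", 3: "S"}, [(0, 1), (1, 2), (2, 3)])
# Query sharing no labels with the data graph.
far = graph({0: "N", 1: "N", 2: "N"}, [(0, 1), (1, 2)])

print(is_candidate(parts, near))
print(is_candidate(parts, far))
```

A graph that passes this filter is only a candidate; an exact graph edit distance computation is still needed to confirm it.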
doi:10.14778/2732232.2732236 fatcat:yu4do4by7rgfhemjgm7edusf4u