Subgraph Matching with Set Similarity in a Large Graph Database

Liang Hong, Lei Zou, Xiang Lian, Philip S. Yu
2015 IEEE Transactions on Knowledge and Data Engineering  
In real-world graphs such as social networks, Semantic Web and biological networks, each vertex usually contains rich information, which can be modeled by a set of tokens or elements. In this paper, we study a subgraph matching with set similarity (SMS²) query over a large graph database, which retrieves subgraphs that are structurally isomorphic to the query graph, and meanwhile satisfy the condition of vertex pair matching with the (dynamic) weighted set similarity. To efficiently process the SMS² query, this paper designs a novel lattice-based index for data graph, and lightweight signatures for both query vertices and data vertices. Based on the index and signatures, we propose an efficient two-phase pruning strategy including set similarity pruning and structure-based pruning, which exploits the unique features of both (dynamic) weighted set similarity and graph topology. We also propose an efficient dominating-set-based subgraph matching algorithm guided by a dominating set selection algorithm to achieve better query performance. Extensive experiments on both real and synthetic datasets demonstrate that our method outperforms state-of-the-art methods by an order of magnitude.
Index Terms: subgraph matching, set similarity, graph database, index
doi:10.1109/tkde.2015.2391125 fatcat:y5owepamfjfpfphbpfdnvfveua
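
To make the query semantics in the abstract concrete, the following is a minimal brute-force sketch of an SMS² query on a toy graph. It is not the paper's method: it uses none of the lattice-based index, signatures, pruning, or dominating-set techniques. It assumes weighted Jaccard similarity as the weighted set similarity measure, which the abstract does not name, and it treats a match as an injective mapping under which every query edge maps to a data edge. The function names, the threshold name `tau`, and the toy data are all hypothetical.

```python
from itertools import permutations


def weighted_jaccard(s, t, weight):
    """Weight of the shared elements divided by the weight of all elements.

    Assumption: weighted Jaccard stands in for the abstract's
    "weighted set similarity"; the abstract does not name the measure.
    """
    union = s | t
    if not union:
        return 0.0
    return sum(weight[e] for e in s & t) / sum(weight[e] for e in union)


def sms2_matches(q_edges, q_sets, g_edges, g_sets, weight, tau):
    """Enumerate every injective mapping of query vertices to data vertices
    such that (1) each matched vertex pair has similarity >= tau and
    (2) each query edge maps to an edge of the data graph.

    Exhaustive search, for illustration on tiny graphs only.
    """
    q_vertices = sorted(q_sets)
    g_edge_set = {frozenset(e) for e in g_edges}  # undirected edges
    results = []
    for image in permutations(sorted(g_sets), len(q_vertices)):
        mapping = dict(zip(q_vertices, image))
        # Set similarity condition on every matched vertex pair.
        if any(weighted_jaccard(q_sets[u], g_sets[v], weight) < tau
               for u, v in mapping.items()):
            continue
        # Structural condition: every query edge must exist in the data graph.
        if all(frozenset((mapping[a], mapping[b])) in g_edge_set
               for a, b in q_edges):
            results.append(mapping)
    return results


# Hypothetical toy data. Weights are passed per query, so they can
# change from one query to the next ("dynamic" weights).
weight = {"db": 1.0, "graph": 2.0, "ml": 1.0, "web": 1.0}
q_sets = {"u1": {"graph", "db"}, "u2": {"ml"}}
q_edges = [("u1", "u2")]
g_sets = {"v1": {"graph", "db", "web"}, "v2": {"ml"}, "v3": {"web"}}
g_edges = [("v1", "v2"), ("v2", "v3")]

matches = sms2_matches(q_edges, q_sets, g_edges, g_sets, weight, tau=0.7)
print(matches)  # [{'u1': 'v1', 'u2': 'v2'}]
```

Here u1 matches v1 with similarity (2 + 1) / (2 + 1 + 1) = 0.75 and u2 matches v2 with similarity 1, and the edge (v1, v2) exists in the data graph, so the single mapping is returned. The search tries every injective assignment, which is exponential in the number of query vertices. That cost is the motivation for the indexing and pruning the abstract describes.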