KlusTree: Clustering Answer Trees from Keyword Search on Graphs [article]

Madhulika Mohanty, Maya Ramanath
2017 arXiv   pre-print
Graph structured data on the web is now massive as well as diverse, ranging from social networks, web graphs to knowledge-bases. Effectively querying this graph structured data is non-trivial and has led to research in a variety of directions -- structured queries, keyword and natural language queries, automatic translation of these queries to structured queries, etc. We are concerned with a class of queries called relationship queries, which are usually expressed as a set of keywords (each
more » ... ord denoting a named entity). The results returned are a set of ranked trees, each of which denotes relationships among the various keywords. The result list could consist of hundreds of answers. The problem of keyword search on graphs has been explored for over a decade now, but an important aspect that is not as extensively studied is that of user experience. We propose KlusTree, which presents clustered results to the users instead of a list of all the results. In our approach, the result trees are represented using language models and are clustered using JS divergence as a distance measure. We compare KlusTree with the well-known approaches based on isomorphism and tree-edit distance based clustering. The user evaluations show that KlusTree outperforms the other two in providing better clustering, thereby enriching user experience, revealing interesting patterns and improving result interpretation by the user.
arXiv:1705.09808v1 fatcat:zerrbzxvafhttfznpfsgjsytka