G-Index Model: A generic model of index schemes for top-k spatial-keyword queries

Hyuk-Yoon Kwon, Haixun Wang, Kyu-Young Whang
2014 World wide web (Bussum)  
A top-k spatial-keyword query returns the k best spatio-textual objects ranked based on their proximity to the query location and relevance to the query keywords. Various index schemes have been proposed for top-k spatial-keyword queries; however, a unified framework covering all these schemes has not been proposed. In this paper, we present a generic model of index schemes for top-k spatial-keyword queries, which we call G-Index Model. First, G-Index Model is a unified framework that ... ly investigates all the possible index schemes for top-k spatial-keyword queries. For this, we conjecture that data clustering is the key element in composing various index schemes and generate index schemes as combinations of clustering. The result shows that all the existing methods map to those generated by G-Index Model. Using G-Index Model, we also discover two new methods that have not been reported before. Second, we show that G-Index Model is generic, i.e., it can generate index schemes for a class of queries integrating arbitrary multiple data types. For this, we show that G-Index Model can enumerate index schemes for two classes of queries: the spatial-keyword query (without the top-k constraint) and the top-k spatial-keyword-relational query, which adds the relational data type to the top-k spatial-keyword query. Third, we propose a cost model of the generated methods for the top-k spatial-keyword query. Consequently, the cost model allows us to do physical database design so as to find an optimal index scheme for a given usage pattern (i.e., a set of query loads and frequencies). We validate the cost model through extensive experiments.
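To make the query semantics in the abstract concrete, the following is a minimal sketch of a top-k spatial-keyword query answered by a brute-force scan. It is not one of the index schemes generated by G-Index Model, and the abstract does not specify a ranking function: the weighted linear combination, the parameters `alpha` and `max_dist`, the keyword-overlap relevance measure, and the field names `id`, `loc`, and `terms` are all assumptions made for illustration.

```python
import heapq
import math


def score(obj, query_loc, query_terms, alpha=0.5, max_dist=10.0):
    """Combine spatial proximity and keyword relevance into one score in [0, 1].

    Assumed scoring: alpha * proximity + (1 - alpha) * relevance.
    """
    # Proximity: 1.0 at the query location, falling linearly to 0.0 at max_dist.
    dist = math.dist(obj["loc"], query_loc)
    proximity = max(0.0, 1.0 - dist / max_dist)

    # Relevance: fraction of query keywords that appear in the object's text.
    wanted = set(query_terms)
    if not wanted:
        relevance = 0.0
    else:
        relevance = len(wanted & set(obj["terms"])) / len(wanted)

    return alpha * proximity + (1.0 - alpha) * relevance


def top_k(objects, query_loc, query_terms, k, alpha=0.5, max_dist=10.0):
    """Return the k highest-scoring objects by scanning every object."""
    return heapq.nlargest(
        k,
        objects,
        key=lambda o: score(o, query_loc, query_terms, alpha, max_dist),
    )


# Hypothetical sample data: three spatio-textual objects.
SAMPLE_OBJECTS = [
    {"id": "a", "loc": (0.0, 0.0), "terms": ["cafe", "wifi"]},
    {"id": "b", "loc": (3.0, 4.0), "terms": ["cafe"]},
    {"id": "c", "loc": (0.0, 1.0), "terms": ["bar"]},
]

if __name__ == "__main__":
    best = top_k(SAMPLE_OBJECTS, (0.0, 0.0), ["cafe", "wifi"], k=2)
    print([o["id"] for o in best])
```

A scan like this touches every object on every query. An index scheme exists to avoid that cost, for example by clustering objects spatially or by keyword so that low-scoring groups can be skipped.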
doi:10.1007/s11280-014-0294-0 fatcat:xamnspocuff2rf4j2chuoktdym