An Efficient Indexing Approach for Continuous Spatial Approximate Keyword Queries over Geo-Textual Streaming Data
ISPRS International Journal of Geo-Information
Current social-network-based and location-based-service applications need to handle continuous spatial approximate keyword queries over high-density geo-textual streaming data. Continuous queries are well known to be expensive, and optimizing their processing remains an open issue. For geo-textual streaming data the performance problem is more serious, since both the location information and the textual description must be matched for each incoming tuple. State-of-the-art continuous spatial-keyword query indexing approaches generally lack both support for approximate keyword matching and high-performance processing of geo-textual streaming data. To tackle this problem, this paper first proposes AP-tree+, an indexing approach that efficiently supports continuous spatial approximate keyword queries by integrating min-wise signatures into an AP-tree. AP-tree+ uses the one-permutation min-wise hashing method, which achieves a much lower signature maintenance cost than the traditional min-wise hashing method because it employs only one hash function instead of dozens. To provide an even more efficient indexing approach, this paper then explores the feasibility of parallelizing AP-tree+ on a Graphics Processing Unit (GPU). We map the AP-tree+ data structure into the GPU's memory as a set of one-dimensional arrays to form the GPU-aided AP-tree+. Furthermore, a data-parallel min-wise hashing algorithm and a GPU-CPU data communication method based on a four-stage pipeline are used to optimize the performance of the GPU-aided AP-tree+. The experimental results indicate that (1) AP-tree+ reduces the space cost by about 11% compared with MHR-tree, (2) AP-tree+ achieves comparable recall and a 5.64× query performance gain over MHR-tree while saving 41.66% of the maintenance cost on average, (3) the GPU-aided AP-tree+ attains an average speedup of 5.76× over AP-tree+, and (4) the GPU-CPU data communication scheme further improves the query performance of the GPU-aided AP-tree+ by 39.4%.
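To illustrate why a single hash function can suffice, the following is a minimal Python sketch of one-permutation min-wise hashing in its generic form, not the AP-tree+ implementation. Several details are assumptions made for illustration only: keywords are represented as sets of q-grams, MD5 truncated to 32 bits stands in for the single hash function, the hash range is split into k = 16 equal bins, and the function names (`qgrams`, `oph_signature`, `estimate_jaccard`) are hypothetical. Each token is hashed once, the minimum offset in each bin forms the signature, and similarity is estimated as the number of matching bins divided by the number of bins that are not empty in both signatures.

```python
import hashlib


def qgrams(word, q=2):
    """Return the set of q-grams of a padded keyword (assumed representation)."""
    padded = "#" * (q - 1) + word + "$" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def _hash32(token):
    """Single hash function: first 32 bits of MD5 (an illustrative choice)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def oph_signature(tokens, k=16):
    """Hash every token once, split the hash range into k bins,
    and keep the minimum offset seen in each bin (None = empty bin)."""
    bin_width = 2**32 // k
    signature = [None] * k
    for token in tokens:
        h = _hash32(token)
        b = min(h // bin_width, k - 1)   # clamp in case k does not divide 2**32
        offset = h - b * bin_width
        if signature[b] is None or offset < signature[b]:
            signature[b] = offset
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Estimate set similarity as matches / (k - jointly empty bins)."""
    matches = 0
    jointly_empty = 0
    for a, b in zip(sig_a, sig_b):
        if a is None and b is None:
            jointly_empty += 1
        elif a == b:
            matches += 1
    usable = len(sig_a) - jointly_empty
    return matches / usable if usable else 0.0


if __name__ == "__main__":
    query = oph_signature(qgrams("restaurant"))
    tuple_kw = oph_signature(qgrams("restaurnat"))  # misspelled keyword
    print(estimate_jaccard(query, tuple_kw))
```

Updating a signature for a new keyword costs one hash evaluation per token, whereas the traditional scheme evaluates every one of its hash functions per token; this is the source of the maintenance saving described above. The estimate is noisy for small k, so a practical index would likely use more bins and some policy for empty bins.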