A graph method for keyword-based selection of the top-K databases

Quang Hieu Vu, Beng Chin Ooi, Dimitris Papadias, Anthony K. H. Tung
2008 Proceedings of the 2008 ACM SIGMOD international conference on Management of data - SIGMOD '08  
While database management systems offer a comprehensive solution to data storage, they require deep knowledge of the schema, as well as the data manipulation language, in order to perform effective retrieval. Since these requirements pose a problem to lay or occasional users, several methods incorporate keyword search (KS) into relational databases. However, most of the existing techniques focus on querying a single DBMS. On the other hand, the proliferation of distributed databases in several conventional and emerging applications necessitates the support for keyword-based data sharing and querying over multiple DBMSs. In order to avoid the high cost of searching in numerous, potentially irrelevant, databases in such systems, we propose G-KS, a novel method for selecting the top-K candidates based on their potential to contain results for a given query. G-KS summarizes each database by a keyword relationship graph, where nodes represent terms and edges describe relationships between them. Keyword relationship graphs are utilized for computing the similarity between each database and a KS query, so that, during query processing, only the most promising databases are searched. An extensive experimental evaluation demonstrates that G-KS outperforms the current state-of-the-art technique on all aspects, including precision, recall, efficiency, space overhead and flexibility of accommodating different semantics.
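The selection idea described in the abstract can be illustrated with a minimal Python sketch. It is not the G-KS algorithm itself: the abstract does not specify how edges are weighted or how similarity is computed, so everything below is an assumption made for illustration. Here an edge links two terms that co-occur in the same tuple, its weight is the co-occurrence count, a database scores zero if any pair of query keywords is unrelated, and the function names `build_keyword_graph`, `score`, and `select_top_k` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
import heapq


def build_keyword_graph(tuples):
    """Summarize one database as a map from term pairs to edge weights.

    Assumption: two terms are related if they appear in the same tuple,
    and the weight is the number of tuples in which they co-occur.
    """
    graph = defaultdict(int)
    for text in tuples:
        terms = sorted(set(text.lower().split()))
        for a, b in combinations(terms, 2):
            graph[frozenset((a, b))] += 1
    return dict(graph)


def score(graph, query):
    """Assumed similarity: sum of edge weights over all query keyword pairs.

    Returns 0 as soon as one pair has no edge, since such a database
    cannot relate all keywords under this simplified model.
    Queries with fewer than two keywords score 0 in this sketch.
    """
    keywords = sorted({k.lower() for k in query})
    total = 0
    for a, b in combinations(keywords, 2):
        weight = graph.get(frozenset((a, b)), 0)
        if weight == 0:
            return 0
        total += weight
    return total


def select_top_k(summaries, query, k):
    """Return up to k database names with the highest non-zero scores."""
    scored = ((score(graph, query), name) for name, graph in summaries.items())
    return [name for s, name in heapq.nlargest(k, scored) if s > 0]
```

Only the databases returned by `select_top_k` would then be queried with a full keyword search, which is the cost saving the abstract refers to. The summaries are built once per database, so query-time work is limited to edge lookups.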
doi:10.1145/1376616.1376707 dblp:conf/sigmod/VuOPT08 fatcat:6bw7vpke4jgwbll6f4fzcznqy4