Haofen Wang, Thanh Tran, Chang Liu
2008 Proceeding of the 17th ACM conference on Information and knowledge management - CIKM '08
The Web contains a large number of documents and, increasingly, semantic data in the form of RDF triples. Many of these triples are annotations associated with documents. While structured queries are the principal means of retrieving semantic data, keyword queries are typically used for document retrieval. Clearly, a form of hybrid search that seamlessly integrates these formalisms to query both documents and semantic data can address more complex information needs. In this paper, we present CE2, an integrated solution that leverages mature database and information retrieval technologies to tackle the challenges of hybrid search at large scale. For scalable storage, CE2 integrates a database with inverted indices. Hybrid query processing is supported in CE2 through novel algorithms and data structures, which allow advanced ranking schemes to be integrated more tightly into the process. Experiments conducted on DBpedia and Wikipedia show that CE2 can provide good performance in terms of both effectiveness and efficiency.
doi:10.1145/1458082.1458258 dblp:conf/cikm/WangTL08 fatcat:qekb2hnnnzcq7jtjyoyebo3xme