Efficient Top-k Keyword Search on XML Streams

Lingli Li, Hongzhi Wang, Jianzhong Li, Jizhou Luo
2008 The 9th International Conference for Young Computer Scientists
Keywords are suitable for querying XML streams without schema information. In current forms of keyword search on XML streams, rank functions do not always represent users' intentions. This paper addresses the problem from another angle. It presents skyline Top-K keyword queries, a novel kind of keyword query on XML streams. For such queries, the skyline is used to choose results on XML streams without considering the complicated factors that influence relevance to the query.
Using skyline query processing techniques, two techniques are presented to process skyline Top-K keyword single queries and multi-queries on XML streams efficiently. Extensive experiments are performed to verify the effectiveness and efficiency of these techniques. According to the experimental results, the algorithms are not sensitive to parameters such as the number of keywords, the number of results, and the number of queries, and the runtime is approximately linear in the size of the document.
We use the following example to illustrate this point. Consider a keyword query Q = {Bob, database, engine} on an XML fragment, as shown in Figure 1. For the same query Q, users A, B, and C have different needs: user A wants projects about database or engine that Bob took part in developing; user B wants projects about "engine database" developed by Bob; user C wants projects about "database engine" developed by Bob. As shown in Figure 2, the query returns three results: result 1, result 2, and result 3. For user A, result 2 and result 3 are relevant to the query, while result 1 is not; for user B, result 2 is relevant, while result 1 and result 3 are not; for user C, result 3 is relevant, while result 1 and result 2 are not.
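The skyline idea that the abstract relies on can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the two score dimensions, and all score values below are hypothetical, chosen only to show how a skyline keeps every result that no other result beats on all measures at once, so no single rank function has to be fixed in advance.

```python
from typing import Dict, List, Tuple

# One tuple of relevance measures per result; higher is better.
# Which measures to use is an assumption made for this sketch.
Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if a is at least as good as b on every measure
    and strictly better on at least one."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def skyline(candidates: Dict[str, Scores]) -> List[str]:
    """Return the names of candidates not dominated by any other candidate."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in candidates.items()
            if other != name
        )
    ]


# Hypothetical scores on two made-up measures (not taken from the paper).
candidates = {
    "result 1": (0.2, 0.3),
    "result 2": (0.9, 0.4),
    "result 3": (0.5, 0.8),
}

print(skyline(candidates))
```

With these invented scores, result 1 is dominated by result 2, while result 2 and result 3 each win on one measure and so both stay in the skyline. This quadratic comparison is only for illustration; a streaming setting would need an incremental method.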
doi:10.1109/icycs.2008.28 dblp:conf/icycs/LiWLL08 fatcat:4oubob55x5gsloz337rppug4na