Top-k query processing in probabilistic databases with non-materialized views

M. Dylla, I. Miliaraki, M. Theobald
2013 2013 IEEE 29th International Conference on Data Engineering (ICDE)  
We investigate a novel approach of computing confidence bounds for top-k ranking queries in probabilistic databases with non-materialized views. Unlike related approaches, we present an exact pruning algorithm for finding the topranked query answers according to their marginal probabilities without the need to first materialize all answer candidates via the views. Specifically, we consider conjunctive queries over multiple levels of select-project-join views, the latter of which are cast into
more » ... talog rules which we ground in a top-down fashion directly at query processing time. To our knowledge, this work is the first to address integrated data and confidence computations for intensional query evaluations in the context of probabilistic databases by considering confidence bounds over first-order lineage formulas. We extend our query processing techniques by a tool-suite of scheduling strategies based on selectivity estimation and the expected impact on confidence bounds. Further extensions to our query processing strategies include improved top-k bounds in the case when sorted relations are available as input, as well as the consideration of recursive rules. Experiments with large datasets demonstrate significant runtime improvements of our approach compared to both exact and sampling-based top-k methods over probabilistic data.
doi:10.1109/icde.2013.6544819 dblp:conf/icde/DyllaMT13 fatcat:pw37hkukv5fnndzpoyeklbi4va