Representing Tuple and Attribute Uncertainty in Probabilistic Databases

Prithviraj Sen, Amol Deshpande, Lise Getoor
2007 Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007)  
There has been a recent surge in work in probabilistic databases, propelled in large part by the huge increase in noisy data sources: sensor data, experimental data, data from uncurated sources, and many others. There is a growing need to be able to flexibly represent the uncertainties in the data, and to efficiently query the data. Building on existing probabilistic database work, we present a unifying framework which allows a flexible representation of correlated tuple and attribute level uncertainties. An important capability of our representation is the ability to represent shared correlation structures in the data. We provide motivating examples to illustrate when such shared correlation structures are likely to exist. Representing shared correlation structures allows the use of sophisticated inference techniques based on lifted probabilistic inference that, in turn, allow us to achieve significant speedups while computing probabilities for results of user-submitted queries.
doi:10.1109/icdmw.2007.11 dblp:conf/icdm/SenDG07 fatcat:dmxy2luohzdsrjo5ycsbeq6k4y