Subi Arumugam, Fei Xu, Ravi Jampani, Christopher Jermaine, Luis L. Perez, Peter J. Haas
2010 Proceedings of the VLDB Endowment  
Enterprises often need to assess and manage the risk arising from uncertainty in their data. Such uncertainty is typically modeled as a probability distribution over the uncertain data values, specified by means of a complex (often predictive) stochastic model. The probability distribution over data values leads to a probability distribution over database query results, and risk assessment amounts to exploration of the upper or lower tail of a query-result distribution. In this paper, we extend the Monte Carlo Database System to efficiently obtain a set of samples from the tail of a query-result distribution by adapting recent "Gibbs cloning" ideas from the simulation literature to a database setting.
doi:10.14778/1920841.1920941 fatcat:2b5yjhb44fcgvp47heto47dc2u