A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is application/pdf
.
Probabilistic Models for Query Approximation with Large Sparse Binary Datasets
[article]
2013
arXiv
pre-print
Large sparse sets of binary transaction data with millions of records and thousands of attributes occur in various domains: customers purchasing products, users visiting web pages, and documents containing words are just three typical examples. Real-time query selectivity estimation (the problem of estimating the number of rows in the data satisfying a given predicate) is an important practical problem for such databases. We investigate the application of probabilistic models to this problem.
arXiv:1301.3884v1
fatcat:v6kun4fi6nebxnfyawnve2wx7e