Estimating join selectivities using bandwidth-optimized kernel density models

Martin Kiefer, Max Heimel, Sebastian Breß, Volker Markl
2017 Proceedings of the VLDB Endowment  
We evaluated our KDE-based join estimators on a variety of synthetic and real-world datasets, demonstrating that they are superior to state-of-the-art join estimators based on sketching or sampling.  ...  Accurately predicting the cardinality of intermediate plan operations is an essential part of any modern relational query optimizer.  ...  Acknowledgment The work has received funding from the European Union's Horizon2020 Research & Innovation Program under grant agreement 671500 (project 'SAGE') and from the German Ministry for Education  ... 
doi:10.14778/3151106.3151112 fatcat:yygmcltujza5vpqogsqzukw3bq

Survey on Query Estimation in Data Streams

Sudhanshu Gupta, Deepak Garg
2009 2009 IEEE International Advance Computing Conference  
We begin in for query estimation.  ...  compete for available space with the database cache Query estimation plays an important role in query and query execution buffers.  ...  for the approximate answers to aggregate queries with number of distinct pairs in the join result.  ... 
doi:10.1109/iadcc.2009.4809224 fatcat:ptjvk56acjczbhhurdc2m2vt5m

End-biased Samples for Join Cardinality Estimation

C. Estan, J.F. Naughton
2006 22nd International Conference on Data Engineering (ICDE'06)  
the data is correlated.  ...  We present a new technique for using samples to estimate join cardinalities. This technique, which we term "end-biased samples," is inspired by recent work in network traffic measurement.  ...  Acknowledgements We thank Sumit Ganguly for the code implementing sketches that we used in our experiments. This work supported in part by NSF grant ITR 0086002.  ... 
doi:10.1109/icde.2006.61 dblp:conf/icde/EstanN06 fatcat:4q7mjlylyjhrvhvxw5kat7kndu

Querying and mining data streams

Minos Garofalakis, Johannes Gehrke, Rajeev Rastogi
2002 Proceedings of the 2002 ACM SIGMOD international conference on Management of data - SIGMOD '02  
over streaming data. § Processing Queries on Streams: Using sketches for self-joins, binary joins, and complex joins over data streams; estimating correlated aggregates; using histogram and wavelet synopses  ...  for approximate-query processing. § Mining High-speed Data Streams: Single-pass algorithms for association-rule discovery, clustering, and decision-tree construction over data streams. § Advanced Topics  ... 
doi:10.1145/564691.564794 dblp:conf/sigmod/GarofalakisGR02 fatcat:cefwebamnfgnjmujwt6k5jsuwy

Querying and mining data streams

Minos Garofalakis, Johannes Gehrke, Rajeev Rastogi
2002 Proceedings of the 2002 ACM SIGMOD international conference on Management of data - SIGMOD '02  
over streaming data. § Processing Queries on Streams: Using sketches for self-joins, binary joins, and complex joins over data streams; estimating correlated aggregates; using histogram and wavelet synopses  ...  for approximate-query processing. § Mining High-speed Data Streams: Single-pass algorithms for association-rule discovery, clustering, and decision-tree construction over data streams. § Advanced Topics  ... 
doi:10.1145/564793.564794 fatcat:ps7a7vvegndj7bmrhclmzl42ry

Join size estimation subject to filter conditions

David Vengerov, Andre Cavalheiro Menck, Mohamed Zait, Sunil P. Chakkappen
2015 Proceedings of the VLDB Endowment  
The proposed algorithm, Correlated Sampling, constructs a small space synopsis for each table, which can then be used to provide a quick estimate of the join size of this table with other tables subject  ...  In this paper, we present a new algorithm for estimating the size of equality join of multiple database tables.  ...  An approach to approximate query processing using nonstandard multi-dimensional wavelet decomposition has been presented in [2], which can estimate join sizes subject to dynamically specified predicate  ... 
doi:10.14778/2824032.2824051 fatcat:uazef5jfsjcbzhdmnn6jvxte74

Guest editor introduction: special section on online analysis and querying of continuous data streams

R. Rastogi
2003 IEEE Transactions on Knowledge and Data Engineering  
The final paper by Ananthakrishna et al. considers the problem of approximately answering correlated-sum aggregate queries on a data stream.  ...  The authors show how generalized sample summaries (which are essentially a set of samples from the input data stream) can be used to approximate answers to the correlated-sum queries.  ... 
doi:10.1109/tkde.2003.1198386 fatcat:m3rhddwpvva3hm4rfrtvbo6o6e

Statistical analysis of sketch estimators

Florin Rusu, Alin Dobra
2007 Proceedings of the 2007 ACM SIGMOD international conference on Management of data - SIGMOD '07  
Sketching techniques can provide approximate answers to aggregate queries either for data-streaming or distributed computation.  ...  The prevalent method for analyzing sketches uses moment analysis and distribution independent bounds based on moments.  ...  by and that can provide good approximations for a wide spectrum of queries.  ... 
doi:10.1145/1247480.1247503 dblp:conf/sigmod/RusuD07 fatcat:abr5bowk2jfjff4os2gdvzfjbq

Approximate continuous querying over distributed streams

Graham Cormode, Minos Garofalakis
2008 ACM Transactions on Database Systems  
The end result is a powerful approximate query tracking framework that readily incorporates several complex analysis queries (including distributed join and multi-join aggregates, and approximate wavelet  ...  For instance, tracking the result size of a join (the "workhorse" correlation operator in the relational world) over the streams of fault/alarm data from two or more IP routers (e.g., with a join condition  ...  In the relational world, join and multi-join queries are basically the "workhorse" operations for correlating two or more data sets.  ... 
doi:10.1145/1366102.1366106 fatcat:v724jii3c5dbbl36gp6p3ctvnu

An improved data stream summary: the count-min sketch and its applications

Graham Cormode, S. Muthukrishnan
2005 Journal of Algorithms  
Our sketch allows fundamental queries in data stream summarization such as point, range, and inner product queries to be approximately answered very quickly; in addition, it can be applied to solve several  ...  We introduce a new sublinear space data structure, the Count-Min Sketch, for summarizing data streams.  ...  It is an open problem to design extremely simple, practical sketches such as our CM Sketch for estimating such correlations and more complex data stream applications.  ... 
doi:10.1016/j.jalgor.2003.12.001 fatcat:n52ctjwn3ncsdl5n4xzj5tgrdq

An Improved Data Stream Summary: The Count-Min Sketch and Its Applications [chapter]

Graham Cormode, S. Muthukrishnan
2004 Lecture Notes in Computer Science  
Our sketch allows fundamental queries in data stream summarization such as point, range, and inner product queries to be approximately answered very quickly; in addition, it can be applied to solve several  ...  We introduce a new sublinear space data structure, the Count-Min Sketch, for summarizing data streams.  ...  It is an open problem to design extremely simple, practical sketches such as our CM Sketch for estimating such correlations and more complex data stream applications.  ... 
doi:10.1007/978-3-540-24698-5_7 fatcat:dlavnb57wnceldrliwpp6ywroa

Multi-query optimization for sketch-based estimation

Alin Dobra, Minos Garofalakis, Johannes Gehrke, Rajeev Rastogi
2009 Information Systems  
Randomized techniques, based on computing small "sketch" synopses for each stream, have recently been shown to be a very effective tool for approximating the result of a single SQL query over streaming  ...  sketching space and the quality of the resulting approximation error guarantees.  ...  Approximating single-query answers with pseudorandom sketch summaries. The basic technique: binary-join size tracking [8, 9].  ... 
doi:10.1016/j.is.2008.06.002 fatcat:zvz4w2lz3restffiqchoaozdby

Simpli-Squared: A Very Simple Yet Unexpectedly Powerful Join Ordering Algorithm Without Cardinality Estimates [article]

Asoke Datta, Yesdaulet Izenov, Brian Tsan, Florin Rusu
2021 arXiv   pre-print
Based on these results, we question whether JOB adequately tests query optimizers or if accurate cardinality estimation is such a fundamental requirement for performing well on the JOB benchmark.  ...  The join order of a given query is computed by splitting the join graph along the many-to-many joins and sorting the tables based on their size.  ...  [1] introduce Pessimistic [1], which uses Count-Min sketches for capturing join-crossing correlations.  ... 
arXiv:2111.00163v1 fatcat:u7wsj2xx6jau7dxgxgqljfcapa

Parallel Streaming Implementation of Online Time Series Correlation Discovery on Sliding Windows with Regression Capabilities

Boyan Kolev, Reza Akbarinia, Ricardo Jimenez-Peris, Oleksandra Levchenko, Florent Masseglia, Marta Patino, Patrick Valduriez
2019 Proceedings of the 9th International Conference on Cloud Computing and Services Science  
This paper addresses the problem of continuously finding highly correlated pairs of time series over the most recent time window and possibly use the discovered correlations to select features for training  ...  a regression model for prediction.  ...  the window, represented by the provided timestamp, together with the Pearson correlation coefficient.  For the regression extensions, each of the extended output tuples shows the approximation of a specific  ... 
doi:10.5220/0007843806810687 dblp:conf/closer/KolevAJLMPV19 fatcat:bufizfkcvfbhpg4mxgi2cii44m

Sketch-Based Multi-Query Processing over Data Streams [chapter]

Alin Dobra, Minos Garofalakis, Johannes Gehrke, Rajeev Rastogi
2016 Data-Centric Systems and Applications  
Randomized techniques, based on computing small "sketch" synopses for each stream, have recently been shown to be a very effective tool for approximating the result of a single SQL query over streaming  ...  sketching space and the quality of the resulting approximation error guarantees.  ...  [Figure: stream query-processing engine holding sketches for streams R_1, ..., R_r in memory and returning approximate answers to the query workload Q_1, ..., Q_q]  ... 
doi:10.1007/978-3-540-28608-0_12 fatcat:pikysomhbbbhrhnqhzfbec7xy4