Fast and accurate computation of equi-depth histograms over data streams

Hamid Mousavi, Carlo Zaniolo
2011 Proceedings of the 14th International Conference on Extending Database Technology - EDBT/ICDT '11  
In this paper, we present a new algorithm to estimate equi-depth histograms for high speed data streams over sliding windows.  ...  Equi-depth histograms represent a fundamental synopsis widely used in both database and data stream applications, as they provide the cornerstone of many techniques such as query optimization, approximate  ...  The authors would like to thank Armita Azari and Deirdre Kerr for their valuable help in gathering the data sets for the experimental results and also proof reading the paper.  ... 
doi:10.1145/1951365.1951376 dblp:conf/edbt/MousaviZ11 fatcat:qwlwglnrsvfnhkyz3rb26k2rwq

Efficient representation of distributions for background subtraction

Yedid Hoshen, Chetan Arora, Yair Poleg, Shmuel Peleg
2013 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance  
Online computation of such histograms is described, and examples are given for background subtraction.  ...  Accurate representation of such distributions, e.g. in a histogram, requires much memory that may not be available when a histogram is computed for each pixel.  ...  However, the online calculation of an equi-depth histogram is not trivial without storing all data values.  ... 
doi:10.1109/avss.2013.6636652 dblp:conf/avss/HoshenAPP13 fatcat:cwk2dizfhngplpvm2uvik33in4

Approximate Query Processing: Taming the TeraBytes

Minos N. Garofalakis, Phillip B. Gibbons
2001 Very Large Data Bases Conference  
2; for more equal buckets, can recompute from the sample) buckets for largest values, equi-depth over the rest • Improvement over equi-depth since get exact info on largest values, e.g., join estimation  ...  ), N= size of domain • Empirical results over synthetic data -Improvements over random sampling and histograms (MaxDiff) -wavelet synopses on the original data distribution -Similar accuracy with CDF,  ...  -Studies the effectiveness of histograms, kernel-density estimators, and their hybrids for estimating the selectivity of range queries over metric attributes with large domains. • -Precursor to [CDN01]  ... 
dblp:conf/vldb/GarofalakisG01 fatcat:ckt4oz7y25f2jmzm5af3jbwax4

Fast computation of approximate biased histograms on sliding windows over data streams

Hamid Mousavi, Carlo Zaniolo
2013 Proceedings of the 25th International Conference on Scientific and Statistical Database Management - SSDBM  
Moreover, very fast approximate algorithms are needed to compute accurate histograms on fast-arriving data streams, whereby online queries can be supported within the given memory and computing resources  ...  In this paper, we define biased histograms over data streams and sliding windows on data streams, and propose the Bar Splitting Biased Histogram (BSBH) algorithm to construct them efficiently and accurately  ...  BASH provides a very fast and memory-efficient equi-depth histogram particularly for high-speed data streams.  ... 
doi:10.1145/2484838.2484851 dblp:conf/ssdbm/MousaviZ13 fatcat:kw7tu4vrwfgnjfdgze2id5pw2a

Constructing fading histograms from data streams

Raquel Sebastião, João Gama, Teresa Mendonça
2014 Progress in Artificial Intelligence  
When constructing online histograms from data streams there are two main characteristics to embrace: the updating facility and the error of the histogram.  ...  Reducing memory occupancy is of utmost importance when handling a huge amount of data. This paper addresses the problem of constructing histograms from data streams under error constraints.  ...  This work was also funded by the European Regional Development Fund through the COMPETE Program, by the Portuguese Funds through the FCT (Portuguese Foundation for Science and Technology) within project  ... 
doi:10.1007/s13748-014-0050-9 fatcat:63xwdgku45ejrajvdk26kld7ce

Histograms as a side effect of data movement for big data

Zsolt Istvan, Louis Woods, Gustavo Alonso
2014 Proceedings of the 2014 ACM SIGMOD international conference on Management of data - SIGMOD '14  
Histograms are a crucial part of database query planning but their computation is resource-intensive.  ...  Moreover, the FPGA can provide various types of histograms such as Equidepth, Compressed, or Max-diff on the same input data in parallel, without additional overhead.  ...  Acknowledgements This work is funded in part by grants from Xilinx, as part of the Enterprise Computing Center (www.ecc.ethz.ch), and Microsoft Research, as part of the Joint Research Center MSR-ETHZ-EPFL  ... 
doi:10.1145/2588555.2612174 dblp:conf/sigmod/IstvanWA14 fatcat:hkiti7hllfh7dgqtaguww7kgya

UDDSketch: Accurate Tracking of Quantiles in Data Streams

Italo Epicoco, Catiuscia Melle, Massimo Cafaro, Marco Pulimeno, Giuseppe Morleo
2020 IEEE Access  
We present UDDSketch (Uniform DDSketch), a novel sketch for fast and accurate tracking of quantiles in data streams.  ...  systems, and are fundamental for statistical data analysis.  ...  [6] presented two fast and efficient procedures for maintaining two classes of histograms: equi-depth histograms and compressed histograms.  ... 
doi:10.1109/access.2020.3015599 fatcat:d77dz7u3vfbnlaq7t7nk5p6gvm

UDDSketch: Accurate Tracking of Quantiles in Data Streams [article]

Italo Epicoco, Catiuscia Melle, Massimo Cafaro, Marco Pulimeno, Giuseppe Morleo
2020 arXiv   pre-print
We present UDDSketch (Uniform DDSketch), a novel sketch for fast and accurate tracking of quantiles in data streams.  ...  On the contrary, UDDSketch is designed so that accuracy guarantees can be given over the full range of quantiles and for arbitrary distribution in input.  ...  Gibbons et al. [5] presented two fast and efficient procedures for maintaining two classes of histogram: equi-depth histograms and compressed histograms.  ... 
arXiv:2004.08604v1 fatcat:mr2qxbvzvnf3zkwinj4kkhawiy

Reducing Data Stream Sliding Windows by Cyclic Tree-Like Histograms [chapter]

Francesco Buccafurri, Gianluca Lax
2004 Lecture Notes in Computer Science  
When mining is applied to data streams, that are continuous data flows, the issue of suitably reducing them is highly interesting, in order to arrange effective approaches requiring multiple scans on data  ...  The histogram, based on a hierarchical structure (opposed to the flat structure of traditional ones), results suitable for directly supporting hierarchical queries, and, thus, drill-down and roll-up operations  ...  Also this comparison shows the superiority of the c-tree over other histogram methods. In Figure 4(a) and 4.  ... 
doi:10.1007/978-3-540-30116-5_10 fatcat:wlphmxwn2zdurdlcl4g5llwyve

Mining Techniques for Streaming Data

Manal Mansour, Manal Abdullah
2022 International Journal of Data Mining & Knowledge Management Process  
It reviews the methods for data stream summarizing and creating synopsis, and the approaches of processing these data synopses.  ...  The goal is to present a model for mining the streaming data which describes the main phases of data stream manipulation.  ...  Compressed Histograms This type works by dividing the data points with high frequencies in singleton buckets while the rest data is divided as equi-depth histogram.  ... 
doi:10.5121/ijdkp.2022.12201 fatcat:nrnwgjf2j5bztnqa2g7t5ocmey

Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches

Graham Cormode
2011 Foundations and Trends in Databases  
They are especially appropriate for streaming data, in which large quantities of data flow by and the sketch summary must  ...  These methods proceed by computing a lossy, compact synopsis of the data, and then executing the query of interest against the synopsis rather than the entire dataset.  ...  Acknowledgments The work of Minos Garofalakis was partially supported by the European Commission under FP7-FET Open (Future and Emerging Technologies) ICT-2009.8.0 grant no. 255957 (LIFT).  ... 
doi:10.1561/1900000004 fatcat:wk7razxkmzcv7fzczftlohblwa

RHist

Lin Qiao, Divyakant Agrawal, Amr El Abbadi
2002 Proceedings of the eleventh international conference on Information and knowledge management - CIKM '02  
Maintaining approximate aggregates and summaries over data streams is crucial to handle the OLAP query workload that arises in applications, such as network monitoring and telecommunications.  ...  We show that R(elaxed)Hist(ogram) is an appropriate summarization under data stream scenario.  ...  In this experiment, we subject the histogram, first to a data stream, and then to a series of queries. MaxDiff, V-Optimal and Equi-depth histograms are built at the end of the data stream.  ... 
doi:10.1145/584792.584870 dblp:conf/cikm/QiaoAA02 fatcat:b4g7ssosnval7eniviz3aomaae

RHist

Lin Qiao, Divyakant Agrawal, Amr El Abbadi
2002 Proceedings of the eleventh international conference on Information and knowledge management - CIKM '02  
Maintaining approximate aggregates and summaries over data streams is crucial to handle the OLAP query workload that arises in applications, such as network monitoring and telecommunications.  ...  We show that R(elaxed)Hist(ogram) is an appropriate summarization under data stream scenario.  ...  In this experiment, we subject the histogram, first to a data stream, and then to a series of queries. MaxDiff, V-Optimal and Equi-depth histograms are built at the end of the data stream.  ... 
doi:10.1145/584869.584870 fatcat:r75c3jysqfbpnhxif7skydlz7u

Delivering QOS in XML Data Stream Processing Using Load Shedding

Ranjan Dash
2012 International Journal of Database Management Systems  
Data Stream Management Systems (DSMS) are fast emerging to address this new type of data, but face challenging issues, such as unpredictable data arrival rate.  ...  The system overloading is even more acute in XML data streams compared to relational streams due to its extra resource requirements for data preparation and result construction.  ...  Histograms have been widely used in data stream systems in many ways such as equi-width, equi-depth and V-Optimal histograms.  ... 
doi:10.5121/ijdms.2012.4304 fatcat:tpcporrjevdyzpcsv4kkhea6gq

Loda: Lightweight on-line detector of anomalies

Tomáš Pevný
2015 Machine Learning  
Besides being fast and accurate, Loda is also able to operate and update itself on data with missing variables. Loda is thus practical in domains with sensor outages.  ...  We compare Loda to several state of the art anomaly detectors in two settings: batch training and on-line training on data streams.  ...  use this fine histogram to return either equi-width or equi-depth histogram with a given number of bins.  ... 
doi:10.1007/s10994-015-5521-0 fatcat:dqcmr7ovzjg4dp7tvbiv3vp7ki