Online Aggregation based Approximate Query Processing: A Literature Survey [article]

Pritom Saha Akash, Wei-Cheng Lai, Po-Wen Lin
2022 arXiv pre-print
Online aggregation-based AQP progressively generates approximate results with some error estimates (i.e., confidence interval) until the processing of all data is done.  ...  Lastly, we discuss some research challenges and opportunities for further advancing online aggregation research.  ...  According to [16], the confidence interval in closed-form can be categorized into two categories: conservative confidence interval and large-sample confidence interval. (1) Conservative Confidence Interval  ...
arXiv:2204.07125v1 fatcat:tdey5uh3szcjhbc3pskwcbf2wa

DAQ

Navneet Potti, Jignesh M. Patel
2015 Proceedings of the VLDB Endowment  
Our prototype scheme delivers speedups over exact aggregation and predicate evaluation, and outperforms sampling-based schemes for extreme value aggregations.  ...  A common response to this challenge is approximate query processing, where the user is presented with a quick confidence interval estimate based on a sample of the data.  ...  [6] show how the confidence interval estimates can be derived in online aggregation.  ... 
doi:10.14778/2777598.2777599 fatcat:qkydqmw7vvephgufxoo7ilanc4

Online aggregation

Joseph M. Hellerstein, Peter J. Haas, Helen J. Wang
1997 Proceedings of the 1997 ACM SIGMOD international conference on Management of data - SIGMOD '97  
methods for returning the output in random order, for providing control over the relative rate at which different aggregates are computed, and for computing running confidence intervals.  ...  The confidence and Interval fields give a probabilistic estimate of the proximity of the current running aggregate to the final result; according to Figure 1, for example, the current average is within  ...  Appendix: Formulas for Running Confidence Intervals In this appendix we provide formulas that can be used to compute conservative and large-sample confidence intervals for a variety of aggregation queries  ...
doi:10.1145/253260.253291 dblp:conf/sigmod/HellersteinHW97 fatcat:7cxsxzvsvff3vpqso4pw3jutty

You can stop early with COLA

Yingjie Shi, Xiaofeng Meng, Fusheng Wang, Yantao Gan
2012 Proceedings of the 21st ACM international conference on Information and knowledge management - CIKM '12  
We formulate a statistical foundation that supports block-level sampling for single-table online aggregations and effective estimation of approximate results and confidence intervals of statistical significance  ...  We also develop a two-phase stratified sampling method to support multi-table aggregations to improve the approximate query answers and speed up the convergence of confidence intervals.  ...  The work in [10] improves the approach in [14] by providing the large-sample and deterministic confidence interval computing methods in the case of single-table and multitable queries.  ... 
doi:10.1145/2396761.2398423 dblp:conf/cikm/ShiMWG12 fatcat:guycck6q6vhblptkpr3otweysu

Ripple joins for online aggregation

Peter J. Haas, Joseph M. Hellerstein
1999 Proceedings of the 1999 ACM SIGMOD international conference on Management of data - SIGMOD '99  
confidence-interval length decreases at each update.  ...  We present a new family of join algorithms, called ripple joins, for online processing of multi-table aggregation queries in a relational database management system (DBMS).  ...  Computing and network resources for this research were provided through NSF RI grant CDA-9401156.  ... 
doi:10.1145/304182.304208 dblp:conf/sigmod/HaasH99 fatcat:y6lf25wvzvapfkchkoz5o3pbf4

Ripple joins for online aggregation

Peter J. Haas, Joseph M. Hellerstein
1999 SIGMOD record  
confidence-interval length decreases at each update.  ...  We present a new family of join algorithms, called ripple joins, for online processing of multi-table aggregation queries in a relational database management system (DBMS).  ...  Computing and network resources for this research were provided through NSF RI grant CDA-9401156.  ... 
doi:10.1145/304181.304208 fatcat:ql3scvzcr5cldpnxt7dlclmrau

An Efficient Block Sampling Strategy for Online Aggregation in the Cloud [chapter]

Xiang Ci, Xiaofeng Meng
2015 Lecture Notes in Computer Science  
One of the most commonly used approaches is online aggregation. Online aggregation responds to aggregation queries against the random samples and refines the result as more samples are received.  ...  As a result, answers of online aggregation based on uniform random sampling can result in poor accuracy for groups with very few tuples.  ...  The work in [2] improves the approach in [1] by providing the large-sample and deterministic confidence interval computing methods in the case of single-table and multi-table queries.  ...
doi:10.1007/978-3-319-21042-1_29 fatcat:k7niv3vpqrf4npk7vfxnhrdjiy

Approximate Query Processing: What is New and Where to Go?

Kaiyu Li, Guoliang Li
2018 Data Science and Engineering  
Existing AQP techniques can be broadly categorized into two categories. (1) Online aggregation: select samples online and use these samples to answer OLAP queries. (2) Offline synopses generation: generate  ...  However, it is rather costly to support OLAP on large datasets, especially big data, and the methods that compute exact answers cannot meet the high-performance requirement.  ...  Error Estimation The confidence interval is widely used to estimate the result quality in most of the random-sampling methods [2], where each confidence interval gives users a numerical interval and  ...
doi:10.1007/s41019-018-0074-4 fatcat:nhkxz345sfhw7byebh5b7hv3my

Interactive data analysis: the Control project

J.M. Hellerstein, R. Avnur, A. Chou, C. Hidber, C. Olston, V. Raman, T. Roth, P.J. Haas
1999 Computer  
such as support and confidence for association rule mining, thresholds for clustering, training sets for classification, and so on.  ...  BATCH VERSUS ONLINE PROCESSING Traditional analysis tools have a black-box interface: The user issues queries, the system processes silently for a significant period, and then the system returns an exact  ...  For example, parallel ripple joins involve stratified sampling techniques, which affect online aggregation estimators and confidence intervals.  ... 
doi:10.1109/2.781635 fatcat:w2e7t3wlbzguzm43edxni7ccoq

FlashP: An Analytical Pipeline for Real-time Forecasting of Time-Series Relational Data [article]

Shuyuan Yan, Bolin Ding, Wei Guo, Jingren Zhou, Zhewei Wei, Xiaowei Jiang, Sheng Xu
2021 arXiv pre-print
We introduce a new sampling scheme, called GSW sampling, and analyze error bounds for estimating aggregations using GSW samples.  ...  affect the fitting of forecasting models, and forecasting results; and second, accordingly, what sampling algorithms we should use to obtain these approximate aggregations and how large the samples are  ...  For a fixed confidence level, the narrower the forecast intervals are, the more confident we are about the prediction.  ... 
arXiv:2101.03298v2 fatcat:uto2r2ws3bbmllvyej7m7qeava

A Probabilistic Estimation of PV Capacity in Distribution Networks from Aggregated Net-load Data

Lewis Waswa, Munyaradzi Justice Chihota, Bernard Bekker
2021 IEEE Access  
The results indicate that the method performs well at lower risk levels, which is expected as the confidence interval is large.  ...  capacity factors in time interval t and aggregated for the whole network.  ... 
doi:10.1109/access.2021.3119467 fatcat:zivfvep5rnarbkmpzd7fpkh42y

Online Building Load Management Control with Plugged-in Electric Vehicles Considering Uncertainties

Moses Amoasi Acquah, Sekyung Han
2019 Energies  
This study presents an online density demand forecast, k-means clustering of PEV groups and stochastic optimisation for robust operation of BESS and PEV for a building.  ...  Most of these solutions resort to deterministic load forecast for the day ahead energy scheduling and do not consider the uncertainties in demand and DES making these solutions vulnerable to uncertainties  ...  95% confidence interval for the forecast.  ... 
doi:10.3390/en12081436 fatcat:jejrdhj5mvgylih66ckjhopms4

Fractional process as a unified model for subdiffusive dynamics in experimental data

Krzysztof Burnecki, Grzegorz Sikora, Aleksander Weron
2012 Physical Review E  
time FARIMA(1,d,1) model is applied in this paper to the random motion of an individual fluorescently labeled mRNA molecule inside live E. coli cells in the experiment described in detail by Golding and  ...  ACKNOWLEDGMENTS The authors would like to thank Ido Golding for providing mRNA data and Eldad Kesten for providing telomere data.  ...  online) Sample MSD (blue circles) for the y coordinate of the trajectory no. 4 with Lévy stable noise with α = 1.81 and estimated 95% confidence intervals obtained via Monte Carlo simulations under the  ... 
doi:10.1103/physreve.86.041912 pmid:23214620 fatcat:nmz7qlcgyjflxbr5gnsc5rssjq

Novel assessment of numerical forecasting model relative humidity with satellite probabilistic estimates

Chloé Radice, Hélène Brogniez, Pierre-Emmanuel Kirstetter, Philippe Chambon
2022 Atmospheric Chemistry and Physics  
The probabilistic comparison is discussed with respect to a classical deterministic comparison confronting each model RH value to the reference average and using a set confidence interval.  ...  The probabilistic comparison allows for a more contrasted assessment than the deterministic one.  ...  We thank the CNES for its financial support through the Megha-Tropiques project and the national AERIS data center, which hosts the satellite data.  ... 
doi:10.5194/acp-22-3811-2022 fatcat:wooeqyxuurdshl5oylnvsy4rha

Spacecraft Reliability-Based Design Optimization Under Uncertainty Including Discrete Variables

Rania Hassan, William Crossley
2008 Journal of Spacecraft and Rockets  
Probabilistic approaches provide interval estimates and the probabilities (also known as confidence levels) that system performance estimates lie within these intervals.  ...  For a given number of samples, the accuracy of prediction increases (the interval width is decreased) when the selected confidence level is decreased.  ...
doi:10.2514/1.28827 fatcat:7amfvdwtyzdvjajhruzkvzxrjq