Approximate quantiles and the order of the stream

Sudipto Guha, Andrew McGregor
2006 Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems - PODS '06  
Using quantiles as an example application, we show that we can design provably better algorithms, and settle several open questions on the impact of order on streams.  ...  In this paper, we investigate the importance of the ordering of a data stream, without making any assumptions about the actual distribution of the data.  ...  ACKNOWLEDGEMENTS We thank Anupam Gupta for suggesting that we consider the case when k is o(n).  ... 
doi:10.1145/1142351.1142390 dblp:conf/pods/GuhaM06 fatcat:d77lcllnurflnhf643hdojtxm4

Quantiles on Streams [chapter]

Michael Vassilakopoulos, Theodoros Tzouramanis, Paolo Terenziani, Chintan Patel, Chunhua Weng, Rafael Romero, Jose-Norberto Mazón, Juan Trujillo, Manuel Serrano, Mario Piattini, Chiranjeeb Buragohain, Subhash Suri (+30 others)
2009 Encyclopedia of Database Systems  
This article describes data stream (single-pass) algorithms for computing an approximation of such quantiles.  ...  SYNONYMS Median; histogram; selection; order statistics DEFINITION Quantiles are order statistics of data: the φ-quantile (0 ≤ φ ≤ 1) of a set S is an element x such that φ|S| elements of S are less than  ...  Thus, the algorithm is restricted to a single scan of the data in the input order, and after this scan it must output an approximation of the quantiles of the input values.  ... 
doi:10.1007/978-0-387-39940-9_290 fatcat:qbico4tgajaprhnpmysezn3nni

Quantiles on Streams [chapter]

Chiranjeeb Buragohain, Subhash Suri
2016 Encyclopedia of Database Systems  
This article describes data stream (single-pass) algorithms for computing an approximation of such quantiles.  ...  SYNONYMS Median; histogram; selection; order statistics DEFINITION Quantiles are order statistics of data: the φ-quantile (0 ≤ φ ≤ 1) of a set S is an element x such that φ|S| elements of S are less than  ...  Thus, the algorithm is restricted to a single scan of the data in the input order, and after this scan it must output an approximation of the quantiles of the input values.  ... 
doi:10.1007/978-1-4899-7993-3_290-2 fatcat:gawdpiatfneyvdpyn23kdalpdq

A Survey of Approximate Quantile Computation on Large-scale Data

Zhiwei Chen, Aoqian Zhang
2020 IEEE Access  
Then, multiple techniques for improving the efficiency and performance of approximate quantile algorithms in various scenarios, such as skewed data and high-speed data streams, are presented.  ...  In this paper, we focus on an order statistic, quantiles, and present a comprehensive analysis of studies on approximate quantile computation.  ...  ACKNOWLEDGEMENT This work is supported in part by the National Key Research and Development Plan (2019YFB1705301) and the National Natural Science Foundation of China (61572272, 71690231).  ... 
doi:10.1109/access.2020.2974919 fatcat:rdi5xlombjfylpqazpribwfqri

A Fast Algorithm for Approximate Quantiles in High Speed Data Streams

Qi Zhang, Wei Wang
2007 International Conference on Scientific and Statistical Database Management  
We present a fast algorithm for computing approximate quantiles in high speed data streams with deterministic error bounds.  ...  In order to achieve high speed performance, the algorithm uses simple block-wise merge and sample operations.  ...  Greenwald for providing the optimized GK01 code and many useful suggestions.  ... 
doi:10.1109/ssdbm.2007.27 dblp:conf/ssdbm/ZhangW07 fatcat:77wejd2jwnhdfbou2j6qhrjetm

Frugal Streaming for Estimating Quantiles: One (or two) memory suffices [article]

Qiang Ma, S. Muthukrishnan, Mark Sandler
2014 arXiv   pre-print
For stochastic streams where data items are drawn from a distribution independently, we analyze and show that the algorithm finds an approximation to the quantile rapidly and remains stably close to it  ...  Modern applications require processing streams of data for estimating statistical quantities such as quantiles with small amount of memory.  ...  Acknowledgements This work was sponsored by the NSF Grant 1161151: AF: Sparse Approximation: Theory and Extensions.  ... 
arXiv:1407.1121v1 fatcat:lchb6d3oczblzav7nv2jztaqkm
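
The snippet above does not spell out the update rule, so the following is only a minimal sketch of a one-memory-unit estimator in the spirit of the paper's title. The function name `frugal_quantile`, its parameters, and the unit step size are assumptions made for illustration: the single stored estimate moves up with probability q when an item exceeds it and down with probability 1 - q when an item falls below it.

```python
import random


def frugal_quantile(stream, q, rng=None, start=0):
    """Track an estimate of the q-quantile using one stored number.

    Illustrative sketch only: the name, signature, and step size of 1
    are assumptions, not taken from the listed abstract.
    """
    if rng is None:
        rng = random.Random()
    estimate = start
    for item in stream:
        r = rng.random()
        if item > estimate and r > 1 - q:
            # Item is above the estimate: step up with probability q.
            estimate += 1
        elif item < estimate and r > q:
            # Item is below the estimate: step down with probability 1 - q.
            estimate -= 1
    return estimate
```

The estimate settles where upward and downward steps balance, which is where a fraction q of the items fall below it. With unit steps the sketch suits integer-valued streams whose range is not much larger than the stream length.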

Space Efficient Quantile Summary for Constrained Sliding Windows on a Data Stream [chapter]

Jian Xu, Xuemin Lin, Xiaofang Zhou
2004 Lecture Notes in Computer Science  
Our algorithm makes one pass on the data stream and maintains an ε-approximate summary. It uses O((1/ε²) log² N) space where N is the number of data items in the window.  ...  In this paper, we study the problem of estimating quantiles over other types of sliding windows.  ...  The work [7, 8] studied the problem of estimating approximate quantiles for whole data stream. The space requirement is O((1/ε) log² N) for ε-approximate quantiles.  ... 
doi:10.1007/978-3-540-27772-9_5 fatcat:yz3ygco7mvcovd5cmpa26giqua

Bounded Space Differentially Private Quantiles [article]

Daniel Alabi, Omri Ben-Eliezer, Anamay Chaturvedi
2022 arXiv   pre-print
Estimating the quantiles of a large dataset is a fundamental problem in both the streaming algorithms literature and the differential privacy literature.  ...  Our basic mechanism estimates any α-approximate quantile of a length-n stream over a data universe 𝒳 with probability 1-β using O( log (|𝒳|/β) log (αϵ n)/αϵ) space while satisfying ϵ-differential privacy  ...  RELATED WORK 2.1 Quantile Approximation of Streams and Sketches Approximation of quantiles in large data streams (without privacy guarantees) is among the most well-investigated problems in the streaming  ... 
arXiv:2201.03380v1 fatcat:jzmpfl347vepthxkahldmibiu4

A Novel Incremental Quantile Estimator Using the Magnitude of the Observations

Hugo Lewi Hammer, Anis Yazidi
2018 2018 26th Mediterranean Conference on Control and Automation (MED)  
The estimators merely relying on the sign of the difference between the quantile estimate and the current observation which seems like a waste of information from the data stream.  ...  Incremental quantile estimators like the deterministic multiplicative incremental quantile estimator by Yazidi and Hammer (2017) are simple and efficient algorithms to estimate and track quantiles  ...  The approximation of the quantiles relies on using linear and parabolic interpolations, while the tails of the distribution are approximated using exponential curves.  ... 
doi:10.1109/med.2018.8443071 dblp:conf/med/HammerY18a fatcat:et6o6d3r7zdaziu5nc4326a74a

Tight Lower Bound for Comparison-Based Quantile Summaries [article]

Graham Cormode, Pavel Veselý
2020 arXiv   pre-print
That is, an ε-approximate quantile summary first processes a stream of items and then, given any quantile query 0<ϕ< 1, returns an item from the stream, which is a ϕ'-quantile for some ϕ' = ϕ±ε.  ...  Quantiles, such as the median or percentiles, provide concise and useful information about the distribution of a collection of items, drawn from a totally ordered universe.  ...  The work is supported by European Research Council grant ERC-2014-CoG 647557.  ... 
arXiv:1905.03838v2 fatcat:5zddf3bpwvelnptmtets4gwa5q
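
The definition in the snippet above can be made concrete with a small checker. This is a sketch under stated assumptions: the name `is_eps_approximate` is hypothetical, ranks are taken as 1-based positions in sorted order, and an item with duplicates is treated as occupying every rank its copies cover. It tests whether a returned item is a ϕ'-quantile for some ϕ' = ϕ ± ε.

```python
from bisect import bisect_left, bisect_right


def is_eps_approximate(items, phi, answer, eps):
    """Check whether `answer` is a phi'-quantile of `items` with |phi' - phi| <= eps.

    Illustrative helper (hypothetical name); ranks are 1-based.
    """
    ordered = sorted(items)
    n = len(ordered)
    lo = bisect_left(ordered, answer)   # count of items strictly below answer
    hi = bisect_right(ordered, answer)  # count of items less than or equal to answer
    if lo == hi:
        # The summary must return an item that actually appeared in the stream.
        return False
    # Copies of answer occupy ranks lo+1 .. hi; accept if that range
    # overlaps the allowed rank window [(phi - eps) * n, (phi + eps) * n].
    return lo + 1 <= (phi + eps) * n and hi >= (phi - eps) * n
```

For example, on the items 1 through 100 with ϕ = 0.5 and ε = 0.05, the item 47 is accepted (rank 47 lies within 45 to 55), while 60 is rejected. Sorting the whole input is only for verification; a streaming summary would not have that luxury.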

Evaluation of Summarization Schemes for Learning in Streams [chapter]

Alec Pawling, Nitesh V. Chawla, Amitabh Chaudhary
2006 Lecture Notes in Computer Science  
We present a time-and-memory-efficient discretization technique based on computing ε-approximate exponential frequency quantiles, and prove bounds on the worst-case error introduced in computing information  ...  We compare the empirical performance of the technique, using it for feature selection, with (streaming adaptations of) two popular methods of discretization, equal width binning and equal frequency binning  ...  . , (x (n) , y (n) ) is the stream of n ordered pairs, each consisting of a feature vector x (i) and a class label y (i) .  ... 
doi:10.1007/11871637_34 fatcat:sgoibqvuobfcbaur3emgnw3u5q

Space-efficient estimation of empirical tail dependence coefficients for bivariate data streams [article]

Alastair Gregory, Kaushik Jana
2019 arXiv   pre-print
Modifications to the space-efficient bivariate copula approximation, presented in this paper, allow the error of approximations to the tail dependence coefficients to remain stream-length invariant.  ...  The approximation, which has stream-length invariant error bounds, utilises recent work on the development of a summary for bivariate empirical copula functions.  ...  Let ∈ [1, ], then we know that using the quantile summary we keep an approximation,̃ ( ) , to the 'th order statistic of the data stream; this approximation satisfies̃ ( ) ≤̃ ( ) ≤ <̃ +1 ( ) ≤̃ +1 ( )  ... 
arXiv:1902.03586v3 fatcat:7hzeijr6grexlk6sx6js6rigcy

A Tight Lower Bound for Comparison-Based Quantile Summaries

Graham Cormode, Pavel Veselý
2020 Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems  
That is, an ε-approximate quantile summary first processes a stream and then, given any quantile query 0 ≤ ϕ ≤ 1, returns an item from the stream, which is a ϕ ′ -quantile for some ϕ ′ = ϕ ± ε.  ...  Quantiles, such as the median or percentiles, provide concise and useful information about the distribution of a collection of items, drawn from a totally ordered universe.  ...  The work is supported by European Research Council grant ERC-2014-CoG 647557.  ... 
doi:10.1145/3375395.3387650 dblp:conf/pods/CormodeV20 fatcat:uoqzl6vkknb2jpyfqtmomgihii

Online Computing Quantile Summaries Over Uncertain Data Streams

Chunquan Liang, Mei Li, Bin Liu
2019 IEEE Access  
Quantile summarization is a useful tool in data streams management and mining that can efficiently capture the distribution of the data.  ...  For an answer to a quantile query on uncertain data, we give the methods for calculating the value of the error and thereby discussing the high-level features of the summaries that can answer approximate  ...  Definition 1 (Approximate Quantile Over Deterministic Data): A φ-quantile on an ordered sequence of points S of size n is the point with rank r = φn; given a rank r, a quantile query is ε-approximate if  ... 
doi:10.1109/access.2019.2891550 fatcat:ifb5xezd2bcxfk5y7wcd5ifbii

Frugal Streaming for Estimating Quantiles [chapter]

Qiang Ma, S. Muthukrishnan, Mark Sandler
2013 Lecture Notes in Computer Science  
For stochastic streams where data items are drawn from a distribution independently, we analyze and show that the algorithm finds an approximation to the quantile rapidly and remains stably close to it  ...  Modern applications require processing streams of data for estimating statistical quantities such as quantiles with small amount of memory.  ...  Stream-quantile curve shows the cumulative stream quantile, and this is the curve which the other algorithms try to approximate if the combined stream is of interest at the beginning.  ... 
doi:10.1007/978-3-642-40273-9_7 fatcat:zucsmbeigrcptops2xykmazzxa