Range queries in OLAP data cubes

Ching-Tien Ho, Rakesh Agrawal, Nimrod Megiddo, Ramakrishnan Srikant
1997 SIGMOD Record
A range query applies an aggregation operation over all selected cells of an OLAP data cube where the selection is specified by providing ranges of values for numeric dimensions. We present fast algorithms for range queries for two types of aggregation operations: SUM and MAX. These two operations cover techniques required for most popular aggregation operations, such as those supported by SQL. For range-sum queries, the essential idea is to precompute some auxiliary information (prefix sums) that is used to answer ad hoc queries at run-time. By maintaining auxiliary information which is of the same size as the data cube, all range queries for a given cube can be answered in constant time, irrespective of the size of the sub-cube circumscribed by a query. Alternatively, one can keep auxiliary information which is 1/b^d of the size of the d-dimensional data cube. Response to a range query may now require access to some cells of the data cube in addition to the access to the auxiliary information, but the overall time complexity is typically reduced significantly. We also discuss how the precomputed information is incrementally updated by batching updates to the data cube. Finally, we present algorithms for choosing the subset of the data cube dimensions for which the auxiliary information is computed and the blocking factor to use for each such subset. Our approach to answering range-max queries is based on precomputed max over balanced hierarchical tree structures. We use a branch-and-bound-like procedure to speed up the finding of max in a region. We also show that with a branch-and-bound procedure, the average-case complexity is much smaller than the worst-case complexity.
doi:10.1145/253262.253274 fatcat:bjbyxr4atrdovaj6utbx3lbhli
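
The prefix-sum idea in the abstract can be illustrated with a minimal two-dimensional sketch. It is not the authors' code: the function names `build_prefix_sums` and `range_sum`, the sample cube, and the extra zero row and column (which makes the auxiliary array slightly larger than the cube, a convenience assumed here to avoid boundary checks) are all illustrative choices. Once the prefix array is built, any rectangular range sum takes four lookups by inclusion-exclusion, regardless of the size of the queried region.

```python
def build_prefix_sums(cube):
    """Return P where P[i][j] is the sum of cube[0..i-1][0..j-1].

    P has one extra leading row and column of zeros (an assumption of
    this sketch) so that queries need no boundary special cases.
    """
    rows, cols = len(cube), len(cube[0])
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            # Inclusion-exclusion over the three already-computed neighbours.
            P[i + 1][j + 1] = cube[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]
    return P


def range_sum(P, lo0, hi0, lo1, hi1):
    """Sum of cube[lo0..hi0][lo1..hi1] (inclusive bounds) in four lookups."""
    return (P[hi0 + 1][hi1 + 1]
            - P[lo0][hi1 + 1]
            - P[hi0 + 1][lo1]
            + P[lo0][lo1])


# Hypothetical 3 x 4 cube, e.g. sales by (age bracket, year).
cube = [
    [3, 5, 1, 2],
    [7, 3, 2, 6],
    [2, 4, 2, 3],
]
P = build_prefix_sums(cube)
total = range_sum(P, 1, 2, 1, 3)  # rows 1..2, columns 1..3
print(total)  # 20
```

In d dimensions the same inclusion-exclusion pattern uses 2^d lookups, which is still independent of the size of the queried sub-cube.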