An array-based algorithm for simultaneous multidimensional aggregates

Yihong Zhao, Prasad M. Deshpande, Jeffrey F. Naughton
1997 SIGMOD record  
Computing multiple related group-bys and aggregates is one of the core operations of On-Line Analytical Processing (OLAP) applications. Recently, Gray et al. [GBLP95] proposed the "Cube" operator, which computes group-by aggregations over all possible subsets of the specified dimensions. The rapid acceptance of the importance of this operator has led to a variant of the Cube being proposed for the SQL standard. Several efficient algorithms for Relational OLAP (ROLAP) have been developed to compute the Cube. However, to our knowledge there is nothing in the literature on how to compute the Cube for Multidimensional OLAP (MOLAP) systems, which store their data in sparse arrays rather than in tables. In this paper, we present a MOLAP algorithm to compute the Cube, and compare it to a leading ROLAP algorithm. The comparison between the two is interesting, since although they are computing the same function, one is value-based (the ROLAP algorithm) whereas the other is position-based (the MOLAP algorithm). Our tests show that, given appropriate compression techniques, the MOLAP algorithm is significantly faster than the ROLAP algorithm. In fact, the difference is so pronounced that this MOLAP algorithm may be useful for ROLAP systems as well as MOLAP systems, since in many cases, instead of cubing a table directly, it is faster to first convert the table to an array, cube the array, then convert the result back to a table.

2. Using the grouping performed on behalf of one of the sub-aggregates as a partial grouping to speed the computation of another sub-aggregate, and
3. To compute an aggregate from another aggregate, rather than from the (presumably much larger) base table.

By contrast, MOLAP systems (for example, Essbase from Arbor Software [CCS93, RJ, AS], Express from Oracle [OC],
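
As an illustration of the ideas in the abstract, the following is a minimal Python sketch of (a) converting a value-based table into a position-based sparse array and (b) computing the Cube by deriving each group-by from an already computed, smaller aggregate rather than from the base data. It is not the paper's algorithm: the names `table_to_array` and `cube`, the dictionary-based sparse array, the SUM aggregate, and the "pick the smallest parent" rule are all assumptions made for this example.

```python
from itertools import combinations


def table_to_array(rows):
    """Turn rows of (dim values..., measure) into a sparse array.

    Hypothetical helper: each distinct value of a dimension is assigned
    an integer position, so cells are addressed by position, not value.
    """
    n_dims = len(rows[0]) - 1
    index = [{} for _ in range(n_dims)]  # per-dimension: value -> position
    cells = {}
    for *values, measure in rows:
        pos = tuple(index[d].setdefault(v, len(index[d]))
                    for d, v in enumerate(values))
        cells[pos] = cells.get(pos, 0) + measure
    return cells, index


def cube(cells, n_dims):
    """Compute SUM for every subset of dimensions (the Cube).

    Returns {kept_dims: {positions_on_kept_dims: sum}}. Each group-by is
    aggregated from a parent with one more dimension, never from the
    base cells directly (except for the first level).
    """
    all_dims = tuple(range(n_dims))
    result = {all_dims: dict(cells)}
    for k in range(n_dims - 1, -1, -1):
        for dims in combinations(all_dims, k):
            # Assumed heuristic: use the candidate parent with fewest cells.
            parents = [p for p in result
                       if len(p) == k + 1 and set(dims) <= set(p)]
            parent = min(parents, key=lambda p: len(result[p]))
            keep = [parent.index(d) for d in dims]
            agg = {}
            for pos, value in result[parent].items():
                key = tuple(pos[i] for i in keep)
                agg[key] = agg.get(key, 0) + value
            result[dims] = agg
    return result


# Made-up example data: (product, region, sales).
rows = [("tv", "east", 1), ("tv", "west", 2),
        ("pc", "east", 3), ("pc", "west", 4)]
cells, index = table_to_array(rows)
result = cube(cells, n_dims=2)
```

Here `result[(0,)]` holds the totals per product, `result[(1,)]` the totals per region, and `result[()]` the grand total; mapping positions back to values through `index` would correspond to converting the result back to a table.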
doi:10.1145/253262.253288 fatcat:hwxx2rrvlfds3mfnqg4gipyr6y