Efficient aggregation algorithms on very large compressed data warehouses

Jianzhong Li, Yingshu Li, Jaideep Srivastava
2000 Journal of Computer Science and Technology  
Multidimensional aggregation and Cube are dominant operations for on-line analytical processing (OLAP). Many efficient algorithms to compute multidimensional aggregation and Cube for relational OLAP have been developed. Some work has been done on how to efficiently compute the Cube on data warehouses which store multidimensional datasets in arrays rather than in tables. However, to our knowledge, there is nothing to date in the literature on aggregation algorithms on compressed data warehouses for multidimensional OLAP. This paper presents a set of aggregation algorithms on very large compressed data warehouses for multidimensional OLAP. These algorithms operate directly on compressed datasets without the need to first decompress them. They are applicable to data warehouses that are compressed using a variety of data compression methods. The algorithms differ in performance depending on dataset parameters, output sizes, and main memory availability. The algorithms are described and analyzed with respect to their I/O and CPU costs. A decision procedure to select the most efficient algorithm for a given aggregation request is also proposed. The analysis and experimental results show that the algorithms perform better than the traditional aggregation algorithms.
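To illustrate what aggregating directly on compressed data can look like, here is a minimal sketch. It is not taken from the paper: it assumes a two-dimensional array stored in row-major order and compressed with simple run-length encoding, and the function name `sum_by_row` and the `(value, run_length)` layout are hypothetical choices made for this example. The sum per row is computed from the runs themselves, so the array is never expanded.

```python
from typing import List, Tuple


def sum_by_row(runs: List[Tuple[int, int]], n_rows: int, n_cols: int) -> List[int]:
    """Sum each row of a row-major, run-length-encoded 2-D array.

    `runs` is a list of (value, run_length) pairs (an assumed layout,
    not the compression scheme used in the paper).
    """
    total_cells = sum(length for _, length in runs)
    if total_cells != n_rows * n_cols:
        raise ValueError("run lengths do not match the array shape")

    totals = [0] * n_rows
    pos = 0  # linear position of the next cell in the logical array
    for value, length in runs:
        remaining = length
        while remaining > 0:
            row = pos // n_cols
            # Cells left in the current row; a run may span several rows.
            room = n_cols - (pos % n_cols)
            take = min(remaining, room)
            # One multiplication accounts for `take` identical cells.
            totals[row] += value * take
            pos += take
            remaining -= take
    return totals


# Logical array [[5, 5, 5], [5, 0, 0]] stored as two runs.
print(sum_by_row([(5, 4), (0, 2)], n_rows=2, n_cols=3))
```

The work done is proportional to the number of runs plus the number of row boundaries they cross, rather than to the number of cells, which is the kind of saving that motivates operating on the compressed form.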
doi:10.1007/bf02948809 fatcat:ja2wykdfh5g5hhchx37tzciena