Cross Table Cubing: Mining Iceberg Cubes from Data Warehouses [chapter]

Moonjung Cho, Jian Pei, David W. Cheung
2005 Proceedings of the 2005 SIAM International Conference on Data Mining  
All existing (iceberg) cube computation algorithms assume that the data is stored in a single base table. In practice, however, a data warehouse is often organized in a schema of multiple tables, such as a star schema or a snowflake schema. Materializing a universal base table by joining multiple tables is often very expensive, or even unaffordable, in both computation time and space in real data warehouses. In this paper, we investigate the problem of computing iceberg cubes from data warehouses. Surprisingly, our study shows that computing iceberg cubes from multiple tables directly can be even more efficient in both space and runtime than computing from a materialized universal base table. We develop an efficient algorithm, CTC (for Cross Table Cubing), to tackle the problem. An extensive performance study on synthetic data sets demonstrates that our new approach is efficient and scalable for large data warehouses.
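
To make the setting concrete, the following is a minimal Python sketch of an iceberg cube over a toy star schema. It is not the CTC algorithm from the paper. The tables, attribute names, and both function names are hypothetical and chosen only for illustration. The first function is the baseline the abstract argues against (materialize the join, then count every group-by cell). The second shows, for single-attribute cells only, that counts can be obtained by aggregating the fact table per foreign key and then reading the dimension table, without building the joined table.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy star schema: a fact table holding foreign keys only,
# and two dimension tables mapping each key to its attribute values.
fact = [(1, 10), (1, 11), (2, 10), (2, 10), (3, 11)]  # (store_id, item_id)
store = {1: {"city": "Paris"}, 2: {"city": "Paris"}, 3: {"city": "Rome"}}
item = {10: {"brand": "A"}, 11: {"brand": "B"}}


def iceberg_from_universal(min_sup):
    """Baseline: materialize the join, then count every group-by cell."""
    rows = [{**store[s], **item[i]} for s, i in fact]
    dims = sorted(rows[0])
    counts = Counter()
    for row in rows:
        # Each row contributes to one cell in every non-empty group-by.
        for k in range(1, len(dims) + 1):
            for group in combinations(dims, k):
                counts[tuple((d, row[d]) for d in group)] += 1
    # Iceberg condition: keep only cells with count >= min_sup.
    return {cell: n for cell, n in counts.items() if n >= min_sup}


def iceberg_single_dimension(min_sup):
    """Single-attribute cells, computed per dimension table without the join."""
    result = {}
    for pos, table in enumerate((store, item)):
        # Count fact rows per foreign key, then push counts to attribute values.
        key_counts = Counter(row[pos] for row in fact)
        counts = Counter()
        for key, n in key_counts.items():
            for attr, value in table[key].items():
                counts[((attr, value),)] += n
        result.update({c: n for c, n in counts.items() if n >= min_sup})
    return result
```

With `min_sup = 2`, both functions agree on the single-attribute cells, while the baseline also returns the one cross-table cell that meets the threshold. Cells that combine attributes from several dimension tables are the harder case that an algorithm such as CTC has to handle. This sketch does not attempt it.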
doi:10.1137/1.9781611972757.41 dblp:conf/sdm/PeiCC05 fatcat:andsaf24ufdp7hh5rdb575uyv4