Multi-dimensional data statistics for columnar in-memory databases

Curtis Kroetsch
2014 Proceedings of the 2014 ACM SIGMOD international conference on Management of data - SIGMOD '14  
The research presented here studies multi-dimensional data statistics in the context of columnar in-memory database systems. Such systems, for example SAP HANA [4], SQL Server Apollo, or IBM BLU, use order-preserving dictionary encoding on the read-optimized storage: the values of a single column are encoded in an ordered, dense-domain dictionary. The dictionary maps variable-length domain values to fixed-size dictionary entries. This encoding reduces memory consumption as
... dictionary values can be represented with fewer bits than the original values, and allows queries to be evaluated efficiently on the encoded data. The main characteristic of the dictionary encoding is that it results in a dense domain of values, which can be exploited for building efficient data statistics objects.
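
The encoding described above can be illustrated with a minimal Python sketch. It is not the implementation used by any of the systems named in the abstract, and it does not reproduce the statistics objects studied in the paper. All function names (build_dictionary, encode, bits_per_code, range_to_codes, code_frequencies) and the sample column are hypothetical, chosen only to show how a sorted dictionary yields dense integer codes, how a range predicate on values becomes a range on codes, and how a dense code domain allows a plain array to serve as a simple one-dimensional frequency statistic.

```python
from bisect import bisect_left, bisect_right


def build_dictionary(column):
    """Return the sorted distinct values; a value's index is its code."""
    return sorted(set(column))


def encode(column, dictionary):
    """Replace each value by its position in the sorted dictionary."""
    return [bisect_left(dictionary, value) for value in column]


def bits_per_code(dictionary):
    """Bits needed for a fixed-size code covering 0 .. len(dictionary) - 1."""
    return max(1, (len(dictionary) - 1).bit_length())


def range_to_codes(dictionary, low, high):
    """Translate low <= value <= high into a half-open code range [lo, hi).

    This works because the dictionary is order-preserving: comparing codes
    gives the same result as comparing the original values.
    """
    return bisect_left(dictionary, low), bisect_right(dictionary, high)


def code_frequencies(codes, dictionary_size):
    """Count occurrences per code; the dense domain allows a flat array."""
    counts = [0] * dictionary_size
    for code in codes:
        counts[code] += 1
    return counts


# Hypothetical sample column of variable-length values.
column = ["pear", "apple", "fig", "apple", "pear", "kiwi"]

dictionary = build_dictionary(column)
codes = encode(column, dictionary)
counts = code_frequencies(codes, len(dictionary))

# Count rows with "b" <= value <= "m" using only codes and the count array.
lo, hi = range_to_codes(dictionary, "b", "m")
matches = sum(counts[lo:hi])
```

In this sketch the predicate is answered from the count array alone, without touching the original strings. Any multi-dimensional statistic would need additional structure that the abstract does not specify.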
doi:10.1145/2588555.2612663 dblp:conf/sigmod/Kroetsch14 fatcat:iagxzzyr7jam5gzvhh27fxujjm