On Producing High and Early Result Throughput in Multijoin Query Plans

Justin K. Levandoski, Mohamed E. Khalefa, Mohamed F. Mokbel
2011 IEEE Transactions on Knowledge and Data Engineering  
This paper introduces an efficient framework for producing high and early result throughput in multi-join query plans. While most previous research focuses on optimizing for cases involving a single join operator, this work takes a radical step by addressing query plans with multiple join operators. The proposed framework consists of two main methods, a flush algorithm and an operator state manager. The framework assumes a symmetric hash join, a common method for producing early results, when processing incoming data. In this way, our methods can be applied to a group of previous join operators (optimized for single-join queries) when taking part in multi-join query plans. Specifically, our framework can be applied by (1) employing a new flushing policy to write in-memory data to disk, once memory allotment is exhausted, in a way that helps increase the probability of producing early result throughput in multi-join queries, and (2) employing a state manager that adaptively switches operators in the plan between joining in-memory data and disk-resident data in order to positively affect the early result throughput. Extensive experimental results show that the proposed methods outperform the state-of-the-art join operators optimized for both single and multi-join query plans.
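For readers unfamiliar with the symmetric hash join that the framework assumes, the following is a minimal in-memory sketch of the general technique, not the paper's implementation. The class name `SymmetricHashJoin`, the `insert` method, and the `"left"`/`"right"` side labels are hypothetical names chosen for illustration; the flushing policy and state manager described in the abstract are deliberately left out.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Illustrative equi-join that emits results as soon as tuples arrive.

    Each input keeps its own hash table. An arriving tuple is first stored
    in its own side's table and then probed against the other side's table,
    so matches are produced without waiting for either input to finish.
    """

    def __init__(self):
        # One hash table per input, mapping join key -> list of rows.
        self.tables = {"left": defaultdict(list), "right": defaultdict(list)}

    def insert(self, side, key, row):
        """Store `row` under `key` for `side` and return the new join results."""
        if side not in self.tables:
            raise ValueError("side must be 'left' or 'right'")
        other = "right" if side == "left" else "left"

        # Build step: remember the tuple for future arrivals on the other side.
        self.tables[side][key].append(row)

        # Probe step: .get avoids creating empty buckets in the other table.
        matches = self.tables[other].get(key, [])

        # Always report pairs as (left_row, right_row).
        if side == "left":
            return [(row, m) for m in matches]
        return [(m, row) for m in matches]


join = SymmetricHashJoin()
first = join.insert("left", 1, "a")     # no right tuples yet
second = join.insert("right", 1, "x")   # matches the stored left tuple
third = join.insert("left", 1, "b")     # matches the stored right tuple
fourth = join.insert("right", 2, "y")   # no left tuple with key 2
```

Because both tables live entirely in memory here, the sketch stops short of the situation the abstract targets: once the memory allotment is exhausted, some stored tuples must be written to disk, and which ones are chosen affects how soon further results can be produced.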
doi:10.1109/tkde.2010.182 fatcat:gmbbuthiyrcmhozkyrw6fdb5ce