Adaptive algorithms for set containment joins

Sergey Melnik, Hector Garcia-Molina
2003 ACM Transactions on Database Systems  
A set containment join is a join between set-valued attributes of two relations, whose join condition is specified using the subset (⊆) operator. Set containment joins are deployed in many database applications, even those that do not support set-valued attributes. In this paper, we propose two novel partitioning algorithms, called the Adaptive Pick-and-Sweep Join (APSJ) and the Adaptive Divide-and-Conquer Join (ADCJ), which allow computing set containment joins efficiently. We show that APSJ outperforms previously suggested algorithms for many data sets, often by an order of magnitude. We present a detailed analysis of the algorithms and study their performance on real and synthetic data using an implemented testbed.
doi:10.1145/762471.762474 fatcat:bdyptxnxazejziiv3hdjn4fimq
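
To make the join condition in the abstract concrete, the following is a minimal sketch of a naive nested-loop set containment join. It is not APSJ or ADCJ, whose details the abstract does not give; it only illustrates the ⊆ predicate that those algorithms evaluate more efficiently. The function name, the dictionary-based relation layout, and the sample data are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Hashable, List, Tuple

# Hypothetical representation: a relation maps a tuple id to its set-valued attribute.
Relation = Dict[Hashable, FrozenSet[int]]


def naive_set_containment_join(r: Relation, s: Relation) -> List[Tuple[Hashable, Hashable]]:
    """Return every (r_id, s_id) pair whose R-side set is a subset of the S-side set.

    This baseline compares all |R| * |S| pairs; partitioning algorithms aim to
    avoid most of these comparisons.
    """
    result = []
    for r_id, r_set in r.items():
        for s_id, s_set in s.items():
            if r_set <= s_set:  # subset test, i.e. r_set ⊆ s_set
                result.append((r_id, s_id))
    return result


# Illustrative data only, not taken from the paper.
R: Relation = {"r1": frozenset({1, 2}), "r2": frozenset({3}), "r3": frozenset()}
S: Relation = {"s1": frozenset({1, 2, 3}), "s2": frozenset({2, 3})}

pairs = naive_set_containment_join(R, S)
print(pairs)
```

Note that the empty set in `r3` is contained in every set of `S`, so it joins with all tuples on the other side.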